Problems with crawler4j - Collection of common programming errors


  • Amit
    java runtime web-crawler crawler4j
    I am using crawler4j and I need to add links at runtime. Let's say I add a seed ‘LinkA’ and crawler4j starts crawling it. While the program is running, I want to add one more seed, ‘LinkB’. Can this be done? If yes, how? Thanks in advance.

  • Amit
    java crawler crawler4j
    In crawler4j we can override the function boolean shouldVisit(WebURL url) and control whether a particular URL is allowed to be crawled by returning ‘true’ or ‘false’. But can we add URLs at runtime? If yes, what are the ways to do that? Currently I can add URLs at the beginning of the program by calling addSeed(String url) before start(BasicCrawler.class, numberOfCrawlers) in the CrawlController class, but if I try to add a new URL later using addSeed(String url), it gives an error. Here is error i
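The idea behind adding seeds at runtime can be sketched without crawler4j: a thread-safe frontier that accepts new URLs while workers are already consuming it. This is a minimal illustration built only on java.util.concurrent, not crawler4j's own implementation; the class RuntimeSeedFrontier and its methods are hypothetical names. crawler4j's CrawlController also has a startNonBlocking(...) method that returns immediately instead of blocking like start(...); whether addSeed(...) can safely be called after it should be checked against the version in use.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of crawler4j: a URL frontier that
// accepts seeds before and during a crawl.
class RuntimeSeedFrontier {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    /** Thread-safe; returns false if the URL was already added. */
    boolean addSeed(String url) {
        if (seen.add(url)) {
            pending.add(url);
            return true;
        }
        return false;
    }

    /** Waits up to timeoutMillis for the next URL; returns null if none arrived. */
    String next(long timeoutMillis) {
        try {
            return pending.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RuntimeSeedFrontier frontier = new RuntimeSeedFrontier();
        frontier.addSeed("http://example.com/LinkA");

        // The worker stops once the frontier has been idle for 500 ms.
        Thread worker = new Thread(() -> {
            String url;
            while ((url = frontier.next(500)) != null) {
                System.out.println("visiting " + url);
            }
        });
        worker.start();

        // Added after the worker has started.
        frontier.addSeed("http://example.com/LinkB");
        worker.join();
    }
}
```

The blocking queue is what makes late seeds possible: workers wait on the queue rather than on a fixed list, so anything added before the idle timeout expires is still picked up.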

  • Alireza Noori
    java multithreading crawler crawler4j
    I have created a custom crawler using crawler4j. In my app, I create a lot of controllers, and after a while the number of threads in the system hits the maximum value and the JVM throws an exception. Even though I call ShutDown() on the controller, set it to null, and call System.gc(), the threads in my app remain open and the app crashes. I used jvisualvm.exe (Java VisualVM) and saw that at one point my app hits 931 threads. Is there a way I can immediately kill all the threads
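A note on the symptom described above: setting a reference to null and calling System.gc() does not stop threads, because a running thread stays alive until its run method returns. One common way to keep the thread count bounded is to run all jobs on a single fixed-size pool and shut it down in two phases. The sketch below uses only java.util.concurrent and is not crawler4j code; BoundedCrawlRunner is a hypothetical name, and the jobs are assumed to respond to interruption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of crawler4j: runs crawl jobs on a
// fixed number of threads and stops them in two phases.
class BoundedCrawlRunner {
    private final ExecutorService pool;

    BoundedCrawlRunner(int maxThreads) {
        // The pool never creates more than maxThreads worker threads.
        this.pool = Executors.newFixedThreadPool(maxThreads);
    }

    void submit(Runnable crawlJob) {
        pool.submit(crawlJob);
    }

    /** Returns true if all worker threads terminated within the timeouts. */
    boolean stop(long timeoutMillis) {
        pool.shutdown(); // accept no new jobs, let queued ones finish
        try {
            if (!pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                pool.shutdownNow(); // interrupt workers that are still running
                return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}
```

With this shape, the number of live worker threads is capped by maxThreads no matter how many jobs are submitted, and stop(...) reports whether the threads actually exited instead of assuming they did.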