Crawling the Web: Harnessing the Power of Nutch with Scala

Apache Nutch is a powerful open-source web crawler written in Java. It can run very large crawls in parallel, downloading, indexing, and archiving millions of pages. In this talk we will look at key architectural details of Nutch and see how easy it is to extend its behavior with Scala plugins.

[…]

Building Massively Scalable Applications with Akka

Historically, writing correct concurrent, scalable, and fault-tolerant applications has been very hard. Akka is an attempt to simplify writing concurrent, scalable, and highly available software for the JVM. Akka provides APIs for both Scala and Java. Akka uses the Actor Model together with Software Transactional Memory (STM) to raise the abstraction level. For fault-tolerance […]