Thank you to everyone who came along to our “Data Mining 2.0” TechDrop last Wednesday, a look at the data mining tools and technology used by the likes of Facebook, Twitter, and Yahoo.

The world is seemingly drowning in data, and surfacing meaningful information is getting harder and harder. Some mind-boggling stats from this session include:

  • Twitter generates ~7 TB of data for analysis daily
  • Facebook adds ~12 TB of compressed data daily
  • Yahoo!'s 4000-node cluster sorted 1 TB of random integers in 62 seconds

This TechDrop looked at open systems tools and technology that are leveraged by these organisations to sort and analyse large amounts of data, including Hadoop, Hive, Pig and R. The session also explained how the various technologies implement and build on the MapReduce algorithm.
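
For readers who missed the session, the MapReduce idea can be sketched in a few lines of plain Python. This is only an illustrative, single-machine word count and not code from the slides: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for this example, and a real Hadoop job would spread the same three steps across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on one machine (hypothetical driver)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data big clusters", "data mining"]
    print(word_count(docs))
```

Because each map call only sees one record and each reduce call only sees one key, both steps can be run in parallel on separate machines, which is the property that tools such as Hive and Pig build on when they turn higher-level queries into MapReduce jobs.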

For our part, ClearPoint is engaged with many clients on Information Management and Business Intelligence projects, including our work with the New Zealand Superannuation Fund and Anaplan. Rather than a conventional product-led or technology-led BI approach, ClearPoint tends to take a software methodology approach to information, ensuring a focus on business needs and drivers. We are also interested in mega trends (Pro-consumer, Big Data, Web platform businesses) and how they influence and inform traditional IT thinking. For example, Microsoft is moving to offer support for SQL-Hadoop integration.

A big thanks to Jonathan Ackerman, who ran the session; the slides are available here.

We welcome any feedback or ideas, and look forward to seeing everyone next time.