Over the last couple of years, there have been brief bursts of content issues here and there, affecting search as well as content analysis. In a recent column for Sparksheet.com, Karyn Campbell (The IdeaList) took an interesting stand, arguing that whatever 3.0 looks like, better filters will play a big part, and that professional, human filters will play an integral role in the next web after all. I bet she has a good nose for these things!

Indeed, this makes sense, and it resonates with a few other clues out there.

Remember: two years ago, Yahoo! patented human intervention in its engine through a "human editor ranking system". At the time, their point was that such a process produced more refined results. The idea that human experts need to be in the game whenever qualitative results must meet high expectations of accuracy and precision has since gained ground. Better filters.

About one year later, one of the Pew Internet studies emphasized that:

Information overload is here, which means anyone with an interest in making sure their news reaches people has to pay close attention to how news now flows and to the production and usage of better filters.

Better filters, again! In a March 2010 Researcher's column by Martin Hayward, some ideas bring more grist to our mill:

the real stars will be those who can make sense of, and draw insight from, vast amounts of data quickly and reliably. we have to move from being an industry where value was derived from providing scarce information, to one where value is derived from connecting and interpreting the vast amounts of information available, to help clients make better business decisions faster

What could this mean for content analysis today, which has one foot in search and the other in qualitative content analysis and curation? More specifically, what would it mean for the business applications of content analysis, such as trend monitoring, sentiment analysis and other applications that deal with one of the largest bodies of information available, namely user-generated content from social media?

Back in 2009, Asi Sharabi painted a realistic but critical portrait of social media monitoring solutions. The systems may have improved since then, but several of the issues he raised are more relevant than ever:
  • "Unreliable data": where do most of your brand's mentions come from? Is there any feature that lets you distinguish spam messages and deceptive reviews from the spontaneous conversational material you would like to draw meaningful insights from? A rhetorical question: of course there is no such feature.
  • "Sentiment analysis is flawed": even though there is progress on the subject, the idea that fully automated systems are costly to set up, train and adapt from one domain to another has also gained ground. This favors a different approach: defining a methodology in which the software and the analyst collaborate to cut through the noise and deliver accurate analysis.
  • "Time consuming": Asi Sharabi put it well, saying it may take "hours and days" to configure a dashboard accurately. Is it appropriate to put this time-consuming step on an end user working in a social media, communication or marketing department? As the author suggests, at some point it becomes more profitable for the client to pay an analyst to do the job.
No, unfortunately, the situation has not changed much since then. Just ask some of the social media analysts who deal with dashboards and have qualitative insights to deliver. Well, maybe I just attract the bad-tempered ones. So, what can be said after that?
A few more words. Making faster yet accurate and consistent business decisions and recommendations with content analysis solutions is not the core of the problem. The core of the problem more likely lies in setting up an appropriate workflow, built around a single main idea: expert systems need experts, and they need them both upstream and downstream of the data analysis process. Data scientists' skills are without any doubt one of the keys to a "better filtering" of content, in order to provide, curate and analyse truly qualitative content.
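
To make the "experts upstream and downstream" idea a little more concrete, here is a minimal Python sketch of such a workflow. Everything in it is an assumption made for illustration: the word lists, the `triage` function and the `min_margin` threshold are hypothetical and do not describe any real monitoring product. Upstream, an analyst curates the spam markers and the domain vocabulary; the software then labels only the clear-cut mentions; downstream, anything ambiguous lands in a review queue for a human.

```python
from dataclasses import dataclass

# Hypothetical lists an analyst would curate upstream for one domain.
SPAM_MARKERS = {"buy now", "click here", "free followers"}
POSITIVE = {"love", "great", "excellent"}
NEGATIVE = {"hate", "broken", "terrible"}


@dataclass
class Triage:
    text: str
    label: str  # "positive", "negative", "spam" or "needs_review"
    score: int  # positive hits minus negative hits


def triage(text: str, min_margin: int = 2) -> Triage:
    """Label a mention automatically only when the signal is clear."""
    lowered = text.lower()
    # Upstream expert knowledge: discard obvious spam before scoring.
    if any(marker in lowered for marker in SPAM_MARKERS):
        return Triage(text, "spam", 0)
    words = [w.strip(".,!?;:") for w in lowered.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if abs(score) >= min_margin:
        return Triage(text, "positive" if score > 0 else "negative", score)
    # Downstream expert: weak or mixed signals go to a human analyst.
    return Triage(text, "needs_review", score)


def build_queues(mentions):
    """Group triaged mentions by label so analysts can pick up their queue."""
    queues = {}
    for mention in mentions:
        result = triage(mention)
        queues.setdefault(result.label, []).append(result)
    return queues


if __name__ == "__main__":
    sample = [
        "Click here for free followers",
        "I love it, great and excellent service!",
        "Love the design but the battery is terrible",
    ]
    for label, items in build_queues(sample).items():
        print(label, [item.text for item in items])
```

A toy lexicon like this is of course nowhere near a real sentiment model. The point is only the routing: the software handles the obvious cases, and the analyst's time goes to the mixed or weak ones.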