
Wednesday, January 07, 2004

Distributed Content Analysis

Remember my endless talking about integrating the warehouses of brains, content analysis, etc.? No? Well, never mind. What matters is that I've worked out a very primitive first step, in a very exploratory form, and I wouldn't mind some feedback on it.

First, a bit of background. According to texts like the CIA's "Psychology of Intelligence Analysis" and other things I've read on cognitive psychology, one of the main problems in coping with information is that we tend to have only a single picture or model in our heads of what is going on at a given moment, and we evaluate the information we receive against this main hypothesis.


The problem inherent in this is that information can be perfectly consistent with a very unlikely hypothesis. To use a somewhat whimsical example, suppose I believe the CIA is using winning lotto numbers as a secret code to transmit orders to agents in Amsterdam. Clearly, there is little or nothing I could observe that would disabuse me of this notion, even though all the same evidence would also support [with far greater likelihood] the hypothesis that the lotto is just a game of chance untainted by intelligence forces [not a great example, I know].

Because of this, the CIA recommends that its analysts use some form of matrix analysis: draw up a table where the columns are hypotheses (as many and as diverse as you can generate about a given situation) and the rows are items of evidence. In each cell of the table you mark whether the evidence is compatible or incompatible with that hypothesis, always trying to eliminate hypotheses by identifying incompatible evidence rather than by piling up supporting evidence, which is always inconclusive (cf. my paranoid lotto fantasy) [as good an explanation of the scientific method as any other].
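To make the matrix idea concrete, here is a minimal sketch in Python. The hypothesis labels and the two items of evidence are purely illustrative, not from the CIA text; the point is only the elimination logic: a hypothesis survives only if no evidence is incompatible with it.

```python
# Sketch of a competing-hypotheses matrix: columns are hypotheses,
# rows are items of evidence, cells are "+" (compatible) or "-"
# (incompatible). Labels here are illustrative assumptions.

hypotheses = ["invasion", "threat-only", "never"]

evidence = {
    "troops massing at border": {"invasion": "+", "threat-only": "+", "never": "-"},
    "no supply build-up":       {"invasion": "-", "threat-only": "+", "never": "+"},
}

# Eliminate any hypothesis contradicted by some item of evidence,
# rather than ranking hypotheses by how much evidence "supports" them.
surviving = [
    h for h in hypotheses
    if all(row[h] != "-" for row in evidence.values())
]
print(surviving)  # -> ['threat-only']
```

Note that "invasion" is compatible with the first item of evidence, yet it still falls: one incompatible cell is enough, which is exactly the asymmetry between refuting and supporting evidence described above.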

My idea is to use this analytical method in a distributed, decentralized way, using the infrastructure of blogs and aggregators with a minimum of interference. I came up with this: suppose I blog about US troop movements on Iraq's border with Iran. There are various hypotheses relevant to this: the US is going to invade Iran [suggested a week ago in post http://blogger_a/invasion], or maybe the US is merely threatening to invade Iran [http://blogger_b/threat predicted this a while ago], or maybe the US will never ever put a soldier on Iran's frontier [http://blogger_c/never]. As things stand now, I would blog the news, comment on it with my own hypothesis, and that would be it. Hardly synergetic.

But I could add the following lines to my post:

<"wob:::"post's permalink"
http://blogger_a/invasion:::+
http://blogger_b/threat:::+
http://blogger_c/never:::-">

Most aggregators, browsers and the like will ignore this, but I'm writing a small modification to an aggregator that reads these lines when parsing the feed and interprets them as:

This post says that its evidence supports the hypotheses in
http://blogger_a/invasion and
http://blogger_b/threat, and that its evidence is incompatible with the
hypothesis in http://blogger_c/never.
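A sketch of how the modified aggregator might do that interpretation, assuming the `url:::sign` line format shown above (the syntax is explicitly still up for debate, so `parse_wob` and its return shape are my own illustrative choices):

```python
def parse_wob(annotation):
    """Parse a wob annotation body into (hypothesis_url, compatible) pairs.

    Assumes each line reads  <hypothesis-url>:::<+ or ->  per the
    tentative syntax in the post; lines that don't match are skipped.
    """
    judgements = []
    for line in annotation.strip().splitlines():
        url, sep, sign = line.strip().rpartition(":::")
        if sep and url:
            judgements.append((url, sign.startswith("+")))
    return judgements

example = """http://blogger_a/invasion:::+
http://blogger_b/threat:::+
http://blogger_c/never:::-"""

print(parse_wob(example))
# -> [('http://blogger_a/invasion', True),
#     ('http://blogger_b/threat', True),
#     ('http://blogger_c/never', False)]
```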

The aggregator would *aggregate* many of these annotations, and then come up with a summary like:

The following hypotheses have been discredited by a post: ... http://blogger_c/never ... [with a list of the posts discrediting each hypothesis]. Posts supporting hypothesis http://blogger_a/invasion: ..., etc.
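The aggregation step itself could be as simple as the sketch below: collect (post, hypothesis, sign) triples from many feeds and bucket them. The post URLs are made up for illustration; the hypothesis URLs are the ones from the example above.

```python
from collections import defaultdict

# Illustrative annotations gathered from several feeds; each triple is
# (annotating post, hypothesis URL, "+" compatible / "-" incompatible).
annotations = [
    ("http://my_blog/troops", "http://blogger_a/invasion", "+"),
    ("http://my_blog/troops", "http://blogger_c/never", "-"),
    ("http://other_blog/logistics", "http://blogger_b/threat", "+"),
]

supporting = defaultdict(list)
discrediting = defaultdict(list)
for post, hypothesis, sign in annotations:
    (supporting if sign == "+" else discrediting)[hypothesis].append(post)

# The summary view: which hypotheses stand, which have been hit.
for h, posts in discrediting.items():
    print(f"Discredited: {h}  [by {', '.join(posts)}]")
for h, posts in supporting.items():
    print(f"Supported:   {h}  [by {', '.join(posts)}]")
```

In the spirit of the matrix method, a real aggregator would probably want to surface the discrediting list first, since a single incompatible post matters more than many supporting ones.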

This works both as a guide to and as a summary of posts. One could gather a set of hypotheses about a certain topic [e.g., the war in Iraq], and then check each day which news supports or discredits a given hypothesis --- in short, gain a clearer and smarter picture of what can be logically inferred from the mass of posts than one could get even by reading them all. In effect, the aggregator pools the smarts of all the posters and synthesizes them into a summary view...

Besides the obvious advantages [possibly a quantum leap in how much insight you can gain from X hours of reading your aggregator], this particular method has the advantage of being doable with today's infrastructure [no need to modify weblogs, just add the wob lines, even by hand] and of being distributed [it can be implemented on any number of weblogs, you can aggregate from any number of feeds, and posts can carry the annotation or not without any problem].

I'm planning to use this to annotate and organize my posts from now on [heck, even the discipline of having to make hypotheses explicit, and state a post's bearing on them, probably does much to raise blogging quality] --- the point is, does anyone think this might interest other bloggers? Does anyone see it as potentially useful? Could it be improved? Does anyone have an idea for a better annotation syntax? Another way of structuring the information? Or is the 'net just not ready for distributed evidence analysis? Your thoughts, please.
