Tuesday 14 June 2016

Research Capabilities 001

Research is a key area which undergirds most other operational areas currently proposed. We should probably try to get a clearer idea of what this means for our design of research systems.

In no particular order, here are some speculative ideas on the topic which may (or may not) be useful as jumping-off points.

David Pavett:
At this stage I think the problem is to design a structure for online debate which is able to help us to produce good quality policy materials and good quality political education materials (God knows the Labour Party needs them!). That needs to be worked out first. Implementation, including platforms and programming, should come after that, when we have a clear idea what we would like to achieve and how, in general terms, we would like to achieve it. We need a paper or two on this to take this discussion to the next stage.

It may be that items of research become the currency of our systems. Certainly, as David's comment suggests, we need to be thinking about what the system produces first and foremost - and it may be that by taking a 'research-item-centric' approach we can help to focus discussion and drive decision-making. The Wiki system is the obvious example: having a concrete document at the centre of debate focusses attention, allows newcomers to quickly see the current state of play and keeps discussion orientated toward results.

Some possible downsides:
  1. Wikipedia, at least, is weakest at the most contentious and tricky part - the talk page.
  2. Having a status quo situation may tend to narrow people's focus too much and inhibit radical ideas.
  3. People may become defensive of the work they have already invested in, and excessively harden their positions.
  4. A Wikipedia page is relatively unstructured. My impression is that Wiki pages can be altered largely independently of other related pages. This isn't really compatible with trying to maintain a well-ordered repository of structured data. In particular it may cause problems with auditing chains of sources and of inferences.

For any reputation system or even, should it be desirable, some kind of 'seniority' concept, contributions to research, perhaps weighted in some way by a usefulness score, might well play a key part. Informally, Wikipedia works in some such way. That is not a recommendation: other things being equal, 'pulling rank' isn't conducive to good working relationships, especially when the rank being pulled isn't always clearly earned and can seem rather cliquish. This is partly a result of the very informality of the de facto pecking order.

However, it is necessary to have some kind of ranking system for pieces of research - how much confidence can be placed in them, etc. - and in so far as some people are likely to need to take up 'senior' roles such as editing, dispute resolution and assessment of contributions, we need some tractable, transparent way to determine which people are best suited to such tasks. There is also the question of using contributions to research (and other areas) as a measure of seriousness and dedication, much as door-knocking and leafletting have tended to be used in the party in the past (not that this is necessarily any recommendation). On the other hand, maybe the idea of ranking contributors is unnecessary, in which case it should be avoided.
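If some form of ranking did turn out to be wanted, the usefulness-weighted measure of contributions floated above could be as simple as the sketch below. Everything in it is an assumption made for illustration: the function name, the idea that each contribution carries a usefulness score between 0 and 1, and the choice to simply add the scores up. How usefulness would actually be assessed is exactly the open question.

```python
from collections import defaultdict


def contribution_scores(contributions):
    """Sum usefulness scores per contributor.

    `contributions` is an iterable of (contributor, usefulness) pairs.
    The 0-to-1 usefulness scale is a hypothetical convention.
    """
    totals = defaultdict(float)
    for contributor, usefulness in contributions:
        if not 0.0 <= usefulness <= 1.0:
            raise ValueError(f"usefulness out of range: {usefulness}")
        totals[contributor] += usefulness
    return dict(totals)


# Hypothetical contributors and scores, purely for illustration.
scores = contribution_scores([
    ("alice", 0.5),
    ("bob", 1.0),
    ("alice", 0.25),
])
```

Because the calculation is explicit, anyone can check how a given total was arrived at, which is the transparency that an informal pecking order lacks.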

This topic overlaps with the Collaborative Working thread - discussion specifically concerning the mechanics of collaboration can profitably be located there, but will need to take into account any conclusions we're able to reach in this wider discussion.

A key topic that may benefit from being addressed here before a separate thread is spawned under the 'Systems' section is how we store and organise the output of research efforts.

I suggest that 'research' covers anything from selecting, archiving and recording the URL of a web page or submitting a properly cited scan from a print book, to producing a substantial piece of original investigation or an extended essay. For these to be maximally useful, they need to have sufficient metadata to allow them to be searched and re-used. Storage may be best done at the most granular level, with larger units of research such as an extended paper being stored as defined aggregates of their smaller constituents.
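As a rough illustration of granular storage with aggregates, here is a minimal Python sketch. The class names, field names and identifiers are all hypothetical, and a real system would presumably sit on a proper database rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ResearchItem:
    """Smallest stored unit, e.g. an archived web page or a cited scan."""
    item_id: str
    kind: str                        # e.g. "web_page", "book_scan"
    title: str
    subjects: tuple[str, ...] = ()   # human-checked subject tags


@dataclass
class Aggregate:
    """A larger piece of research, stored as an ordered list of item ids."""
    aggregate_id: str
    title: str
    member_ids: list[str] = field(default_factory=list)


class Repository:
    def __init__(self):
        self.items = {}  # item_id -> ResearchItem

    def add(self, item):
        self.items[item.item_id] = item

    def search(self, subject):
        """Return items tagged with the given subject, in insertion order."""
        return [i for i in self.items.values() if subject in i.subjects]

    def resolve(self, aggregate):
        """Return the constituent items of an aggregate, in order."""
        return [self.items[m] for m in aggregate.member_ids]


# Hypothetical example data.
repo = Repository()
repo.add(ResearchItem("w1", "web_page", "Archived article on housing", ("housing",)))
repo.add(ResearchItem("b1", "book_scan", "Scanned page from a print book", ("housing", "history")))
repo.add(ResearchItem("w2", "web_page", "Archived article on rail", ("transport",)))
paper = Aggregate("p1", "Extended paper on housing", ["b1", "w1"])
```

Here `repo.search("housing")` finds both housing items whether or not they belong to the paper, and `repo.resolve(paper)` reassembles the paper from its parts, so the same small item can be re-used in several larger pieces.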

Can we use machine methods to classify and document metadata such as subject area, type of information, etc.? Could such methods operate 'on the fly' so as to obviate the need for storing much metadata permanently? I think this is unlikely - fuzzy algorithms can get pretty fuzzy at times in my experience, and the relatively minor discipline of maintaining human-checked metadata seems a small price to pay for the certainty, stability and sanity-checking that it provides. Or maybe this is all completely wrong.

One suggestion that I think is important is the idea of a chain of sources. This is of course a basic concept in any field of research or investigation. It may not always be easy or even important to document specific conclusions of a piece of research: its 'output' may not be easily defined. Its inputs, however, are - or should be. Any item of research should name its sources. We might consider a model in which only sources held within the research repository can be cited - anyone submitting an item of research must also submit their sources, preferably going back to 'primary' sources - sources which do not need (or have) further sources. How 'primary source' is defined and how such sources are to be stored is a further issue. The idea here, in any case, would be to maintain the integrity - and defensibility - of our research by forcing people to properly source all work.
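The 'only cite what is in the repository' rule, and the tracing of an item back to its primary sources, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class, method names and identifiers are hypothetical, and 'primary' is taken here to mean simply 'cites nothing', which sidesteps the definitional question raised above.

```python
class SourceRepository:
    def __init__(self):
        self._cites = {}  # item_id -> tuple of ids that the item cites

    def submit(self, item_id, cites=()):
        """Accept an item only if every source it cites is already held."""
        if item_id in self._cites:
            raise ValueError(f"{item_id!r} is already in the repository")
        missing = [c for c in cites if c not in self._cites]
        if missing:
            raise ValueError(f"{item_id!r} cites sources not held: {missing}")
        self._cites[item_id] = tuple(cites)

    def primary_sources(self, item_id):
        """Follow the chain of sources down to items that cite nothing."""
        seen, stack, primaries = set(), [item_id], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            cited = self._cites[current]
            if not cited:
                primaries.add(current)
            stack.extend(cited)
        return primaries


# Hypothetical identifiers, for illustration only.
sources = SourceRepository()
sources.submit("census-table")                      # a primary source
sources.submit("report", ["census-table"])
sources.submit("essay", ["report", "census-table"])
```

Because sources must be submitted before the work that cites them, the chain can never loop back on itself, and `sources.primary_sources("essay")` gives an auditor the full set of foundations that the essay ultimately rests on.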

Another aspect is how, if at all, external sources of data are to be made available to researchers - maybe we have a research portal which makes it easy to locate various types of information from canonical sources. This might in turn partly automate the process of providing and storing source data. If at some stage the project attracts funding, it may be possible to incorporate institutional subscriptions to sources such as JSTOR and other journal collections, as well as press cuttings services of the kind available from, e.g., LexisNexis.

Research is needed into the technologies available for this kind of thing. Some kind of triple store seems vaguely as if it might be suitable; there is a lot of buzzwordy stuff around about the 'semantic web' etc., but I am not sure how meaty any of it actually is. This does seem something that it is important to get right early on.
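For anyone unfamiliar with the term, a triple store holds facts as (subject, predicate, object) statements and answers queries by pattern matching. The toy Python version below is only meant to show the idea; the class, the predicate names and the identifiers are hypothetical, and it is not a substitute for evaluating real triple-store software.

```python
class TripleStore:
    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        """Record one statement, e.g. ("essay", "cites", "report")."""
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self._triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        )


# Hypothetical statements about research items.
store = TripleStore()
store.add("essay", "cites", "report")
store.add("report", "cites", "census-table")
store.add("essay", "hasSubject", "housing")
```

A query such as `store.match(predicate="cites")` lists every citation link, while `store.match(subject="essay")` lists everything recorded about one item. Metadata and source chains both fit this one shape, which is perhaps why the approach looks attractive for a research repository.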
