Our methodology


Most of the Syria Program products and services at the Observatory for Political and Economic Networks that seek to understand complex systems are built on network knowledge bases (knowledge graphs) constructed from published sources. In general, construction proceeds in three stages:

  1. Extraction: Data sources vary in reliability, so the collected data are given a "reliability weight" according to the source type.
  2. Processing: Researchers collect or update data at specified intervals, and then align them with the data model and predefined classifications.
  3. Weaving: Classified and tagged data with high reliability are used as inputs in producing research products and observatory dashboards, as well as network, geographic, and temporal analytics to serve six different types of partners.


Extraction

We rely on many sources for building network knowledge bases. Data are collected both manually and automatically, then stored after initial processing.

The Fact-Checking process consists of five stages:

  1. Provenance and Origin Verification: The primary source of data exists
  2. Objectivity: The author of the data is known and objective
  3. Reliability: The data have been reused to build further investigations
  4. Content Cross-Check: The content can be checked against a variety of independent sources
  5. Credibility Cross-Check: The supporting sources are themselves reliable

Each data source has its own level of reliability in terms of providing valid information. We embed that reliability in each relationship added to our network knowledge base as a “confidence weight index” score ranging from 0% to 100%.

For example, an ownership connection between a company and an individual that we find on that company’s official website, or in documents from the Ministry of Internal Trade and Consumer Protection, would be added to our network knowledge base with a confidence weight score of 100%. The same connection found on an unverified Facebook page would still be recorded in our database, but it would be tagged with a confidence weight score of only 20% and monitored for any indications of validation or discrediting. In this way the information is never lost and we work constantly to verify it, while any user of the interactive review tool in the network knowledge base can query for results with a high confidence weight and never see the least reliable data.
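The filtering behavior described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Observatory's actual tooling: the `Relationship` class, the `filter_by_confidence` function, the placeholder names, and the threshold of 80 are all hypothetical. Only the 100% and 20% scores come from the example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """One edge in the knowledge base (hypothetical structure)."""
    source: str
    target: str
    kind: str
    confidence: int  # confidence weight, 0 to 100
    origin: str      # where the relationship was found


def filter_by_confidence(relationships, minimum):
    """Return only relationships whose confidence weight is at least `minimum`."""
    if not 0 <= minimum <= 100:
        raise ValueError("minimum must be between 0 and 100")
    return [r for r in relationships if r.confidence >= minimum]


# Placeholder data mirroring the example in the text.
relationships = [
    Relationship("Person A", "Company X", "OWNS", 100, "official company website"),
    Relationship("Person B", "Company X", "OWNS", 20, "unverified Facebook page"),
]

# Low-confidence records stay in the store; they are only hidden from this query.
high_confidence = filter_by_confidence(relationships, 80)
for r in high_confidence:
    print(f"{r.source} -[{r.kind}, {r.confidence}%]-> {r.target}")
```

Keeping the weight on the relationship itself, rather than deleting weak records, is what lets a low-confidence link be upgraded later if it is validated.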

We rely on sources such as death records and leaked data search engines to monitor kinship and intermarriage, as well as personal photos on social media and public photos in magazines to infer friendships and other forms of affection or cultural ties (especially on holidays and other special occasions). We also use open-source intelligence (OSINT) techniques to collect additional data. We do not stop at the borders of Syria: our monitoring of relations of all kinds extends to the Arab region and the rest of the world.

Processing

Knowledge is cumulative. Network knowledge bases handle this accumulation by defining a logical data model that describes entities and the relationships between them, then linking and storing them in graph databases (which differ from relational databases).
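As a rough illustration of what storing entities and relationships with properties can look like, here is a small in-memory sketch in Python. It is not the Observatory's data model or database: the `KnowledgeGraph` class, its method names, the entity identifiers, and the property names are all hypothetical, and a production system would use a real graph database instead.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory property graph (illustrative only)."""

    def __init__(self):
        self.entities = {}                  # entity id -> {"type": ..., properties}
        self.relationships = []             # append-only list of relationship records
        self._outgoing = defaultdict(list)  # entity id -> indexes into relationships

    def add_entity(self, entity_id, entity_type, **properties):
        # Later calls for the same id accumulate properties instead of replacing them.
        record = self.entities.setdefault(entity_id, {"type": entity_type})
        record.update(properties)

    def add_relationship(self, source, target, kind, **properties):
        for endpoint in (source, target):
            if endpoint not in self.entities:
                raise KeyError(f"unknown entity: {endpoint}")
        self.relationships.append(
            {"source": source, "target": target, "kind": kind, **properties}
        )
        self._outgoing[source].append(len(self.relationships) - 1)

    def neighbors(self, entity_id):
        """Entities reachable from `entity_id` over one outgoing relationship."""
        return [self.relationships[i]["target"] for i in self._outgoing[entity_id]]


graph = KnowledgeGraph()
graph.add_entity("person_a", "Person", name="Person A")
graph.add_entity("company_x", "Company", name="Company X")
graph.add_relationship("person_a", "company_x", "OWNS", confidence=100)
graph.add_entity("company_x", "Company", sector="trade")  # a later update accumulates
```

Unlike a relational table, the relationship here is a first-class record with its own properties, so following a link is a direct lookup and not a join.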

We designed the following Logical Data Model to meet the needs of the Observatory for Political and Economic Networks (Syria Program):




Linking this model to online databases allows us to:

  1. Link entities (oval shapes in the model above) together across different types of functional relationships, with both entities and relationships having properties that describe them (light gray lines in the model)
  2. Accumulate developments and updates on entities and relationships, labeling them with timestamps and geolocations
  3. Offer a network search and query feature that follows direct and indirect relationships between entities
  4. Build algorithms that explore patterns in the network of relationships among entities, helping to:
    1. explore the shortest path between entities
    2. explore networked sub-communities with unique characteristics (Community Detection)
    3. calculate levels of influence and importance according to Influence and Centrality algorithmic metrics
    4. calculate the resilience and resistance of sub-networks and conduct comparative analysis between them
  5. Carry out various types of temporal and geographic analyses, such as scenario analysis, what-if analysis, and simulation and prediction
  6. Enable the integration of various logical data models with each other, including those at different levels of detail (for example, integrating less detailed models for the purpose of applying systems-thinking analytics with more detailed models designed for the inclusion of personal networks).
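Two of the algorithm families listed above, shortest path and centrality, can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not the Observatory's implementation: the graph is treated as undirected and unweighted, breadth-first search stands in for whatever path algorithm is actually used, degree centrality stands in for the wider family of influence metrics, and the node labels are placeholders.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    if start not in adjacency or goal not in adjacency:
        return None
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):  # sorted for repeatable results
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def degree_centrality(adjacency):
    """Share of the other nodes that each node is directly connected to."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


# Placeholder network of five entities.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D"), ("B", "E")]
adjacency = build_adjacency(edges)
path = shortest_path(adjacency, "A", "D")
centrality = degree_centrality(adjacency)
print(path, centrality)
```

Degree centrality is the simplest such metric. Measures like betweenness or eigenvector centrality weigh a node's position in the whole network and are usually taken from a graph library instead of being written by hand.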

Weaving

Based on the questions raised by our partners, we determine the data boundaries and incorporate the ties between various data points. We then propose the appropriate analytical approach and the ultimate product or service.