Menai Insight
  • Home
  • Focus Areas
    • Managerial backgrounds
    • Committee activities
    • Performance evaluations
  • Articles
    • Theory
    • Our Approach
  • Help and Support
    • Using Menai Insight
    • Textual Structure Specifications
    • Manage Your Account
    • Contact Support
  • New Page

Embedding validation

Multiple stages of validation help ensure our accuracy

Note: This page is actively being worked on - parts may be incomplete.

We have embedded multiple stages of validation throughout our development process:
  • Qualitative Development of the Textual Structures: Ensuring that the textual structures reflect the underlying material
  • Validation of the Classifications: Ensuring that the material populates the developed textual structures correctly
  • Manual Oversight and Face-validity: Ongoing checks to ensure that the overall population process is occurring as expected

Qualitative Development of the Textual Structures

The first stage in classifying the texts involved developing a representation of the material. While each sentence in a particular communication medium (e.g., managerial backgrounds) is unique, there is substantial underlying similarity in the material discussed. For example, managerial backgrounds typically discuss positions that a manager has worked in, experiences that they have gained, and qualifications and professional licenses that they have received.

Each of the textual structures was developed with substantial qualitative consideration of the underlying material, drawing from research on themes (e.g., Glaser and Strauss, 1967; Ryan and Bernard, 2003) to help ensure that the textual structures reflect the underlying text. After first identifying the primary dimensions of the text, each of these sentences was then dissected again into its components.

Validation of the Classifications

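The second stage in classifying the text is to ensure that the material is correctly populated into the textual structures. This involved a combination of manual classification and machine-learned classification. Three primary approaches are used to ensure that the material is classified appropriately:
  • Validation by context: Ensuring that classifications are appropriate for the context in which a term occurs
  • Validation by dissection: Dissecting concepts into underlying properties that can be manually verified
  • Validation of terms through external-data-checks: Verifying terms against external databases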

Validation by context
This validation checks that the context in which a term occurs in a sentence is appropriate. For example, the concept sequence PERSON_NAME IS MANAGEMENT_TITLE AT COMPANY_NAME is common and appropriate, whereas a sequence such as PERSON_NAME RECEIVED COMPANY_NAME FROM UNIVERSITY_NAME is uncommon and unlikely to be correct (i.e., it likely indicates that a degree acronym has been incorrectly classified as a company name). This validation includes three components:
  1. Manual checks to identify unlikely concept sequencing.
  2. Machine-learned identification, flagging cases where classifications through machine learning are inconsistent with the classified concept.
  3. Identification of concept sequencing that does not conform to that expected in the textual structures.
This helps identify incorrectly classified concepts, irrespective of whether the concepts share common terms (e.g., LOCATION and MANAGEMENT_TITLE), and, in conjunction with validation through dissection and external checks, helps ensure the validity of concepts.
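
The third component above can be illustrated with a minimal sketch. This is not Menai Insight's implementation; the set of expected sequences, the DEGREE concept name, and the function name are assumptions made purely for illustration.

```python
# Hypothetical set of concept sequences expected by the textual structures.
# DEGREE is an assumed concept name, not one taken from the specifications.
EXPECTED_SEQUENCES = {
    ("PERSON_NAME", "IS", "MANAGEMENT_TITLE", "AT", "COMPANY_NAME"),
    ("PERSON_NAME", "RECEIVED", "DEGREE", "FROM", "UNIVERSITY_NAME"),
}


def flag_unexpected(sequences, expected=EXPECTED_SEQUENCES):
    """Return the concept sequences that are not in the expected set."""
    return [seq for seq in sequences if tuple(seq) not in expected]


classified = [
    ["PERSON_NAME", "IS", "MANAGEMENT_TITLE", "AT", "COMPANY_NAME"],
    # A degree acronym misclassified as a company name produces this sequence.
    ["PERSON_NAME", "RECEIVED", "COMPANY_NAME", "FROM", "UNIVERSITY_NAME"],
]

flagged = flag_unexpected(classified)
for seq in flagged:
    print("Review:", " ".join(seq))
```

A flagged sequence is only a candidate for review; in this sketch a human (or a further check) would still decide which concept in the sequence was misclassified.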

Validation by dissection
By dissecting concepts into their underlying properties, and manually verifying both the much smaller number of terms in the sub-concepts and the sequencing of the sub-concepts, it is possible to validate a much larger number of terms. For example, the validity of concepts composed of separate parts (e.g., MANAGEMENT_TITLE) can be assessed, despite there being tens of thousands of unique titles at the overall level.
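
A rough sketch of the idea, under stated assumptions: the sub-concept names (SENIORITY, ROLE, FUNCTION), their term lists, and the allowed orderings below are all hypothetical and are not drawn from the actual textual structure specifications. The point is only that a small, manually verified vocabulary plus a handful of allowed orderings can cover many distinct titles.

```python
# Hypothetical, manually verified sub-concept vocabularies.
SUB_CONCEPT_TERMS = {
    "SENIORITY": {"senior", "executive", "assistant"},
    "ROLE": {"director", "manager", "president"},
    "FUNCTION": {"finance", "marketing", "operations"},
}

# Hypothetical, manually verified orderings of sub-concepts within a title.
ALLOWED_SEQUENCES = {
    ("ROLE",),
    ("SENIORITY", "ROLE"),
    ("ROLE", "OF", "FUNCTION"),
    ("SENIORITY", "ROLE", "OF", "FUNCTION"),
}


def dissect(title):
    """Map each word of a title to its sub-concept (None if unverified)."""
    labels = []
    for word in title.lower().split():
        if word == "of":
            labels.append("OF")
            continue
        label = next(
            (name for name, terms in SUB_CONCEPT_TERMS.items() if word in terms),
            None,
        )
        labels.append(label)
    return tuple(labels)


def is_valid_title(title):
    """A title is valid if its sub-concept sequence is an allowed ordering."""
    return dissect(title) in ALLOWED_SEQUENCES


for title in ["Senior Director of Finance", "Finance of Director", "Senior Wizard"]:
    print(title, "->", dissect(title), is_valid_title(title))
```

With nine verified words and four orderings, this toy setup already accepts dozens of distinct titles, which is the leverage the dissection approach relies on.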

Validation of terms through external-data-checks
By connecting terms to external databases, it is possible to verify concepts underpinned by a large number of labels, such as location information, that are infeasible to verify manually and that lack the repetition in underlying words needed to allow dissection.
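
As an illustrative sketch only: the reference set below stands in for an external database, and the names KNOWN_LOCATIONS, normalize, and check_against_reference are hypothetical. A real check would query whichever external source is actually used.

```python
def normalize(term):
    """Lower-case a term and collapse whitespace so that lookups are tolerant."""
    return " ".join(term.casefold().split())


def check_against_reference(terms, reference):
    """Split terms into those found in the reference data and those not found."""
    reference_keys = {normalize(name) for name in reference}
    verified, unverified = [], []
    for term in terms:
        if normalize(term) in reference_keys:
            verified.append(term)
        else:
            unverified.append(term)
    return verified, unverified


# Hypothetical stand-in for an external location database.
KNOWN_LOCATIONS = {"London", "New York", "Singapore"}

# Terms that were classified as LOCATION and now need verification.
verified, unverified = check_against_reference(
    ["new  york", "London", "MBA"], KNOWN_LOCATIONS
)
print("Verified:", verified)
print("Needs review:", unverified)
```

Terms that fail the lookup are not necessarily wrong; they are simply the ones left for manual review or another validation route.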

Manual Oversight and Face-validity

Checks are performed throughout the process to ensure that the textual structures are being populated in line with those developed.

Beyond documenting the textual structures, the examples and summary statistics included in Appendix D illustrate that the textual structures, properties, and classifications correspond closely to what would be expected.

© 2019 Menai Insight, LLC      Terms, privacy, and other policies
