AutoClassifier modernizes tagging by combining AI-driven and rules-based approaches. It provides taxonomy management and entity recognition and extraction, and uses machine learning and Cognitive Services from Microsoft and Google to power metadata creation, document summaries, image OCR, audio transcripts, and more.

AutoClassifier removes the burden of tagging from users when generating new content. It also tags existing content accurately and consistently to help users find relevant information quickly.

When coupled with BA Insight’s SmartHub, the data intelligence created by AutoClassifier enables highly accurate search results; personalized, predictive, and proactive delivery of content to users; and a complete framework for data segmentation and analysis across all enterprise data.


Key Capabilities:

AI and Machine Learning

AutoClassifier brings advanced capabilities that help users find information faster and surface automated intelligence about all content. Its primary AI integration points are Natural Language Processing and Multimedia Analysis.

  • AutoClassifier’s Natural Language Processing capabilities include automatic creation of summaries of key documents, identification of similar documents, and extraction of concepts found within a document.
  • AutoClassifier’s image and video analysis capabilities include extraction of all text that appears within an image; extraction of all speech within a video, returned as a text-searchable transcript; identification of signatures within a document; extraction of images and videos from within documents; and identification of objects, locations, and activities captured in images and video.
  • Where companies require finer control to meet specific needs, AutoClassifier’s rules-based tagging automatically assigns concepts from a taxonomy or ontology to content through customer-defined rules. Rules can be generated automatically and still allow very precise control. For example, content can be tagged when it matches particular concepts and also has specified metadata. Matching can be hierarchical to disambiguate terms that may have different meanings in different contexts.
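
The rules-based tagging described above can be illustrated with a minimal Python sketch. This is not AutoClassifier's rule syntax or API; the Rule class, its field names, and the sample taxonomy paths are all hypothetical. It only shows the general idea: a concept is assigned when one of its terms appears, the required metadata is present, and (for ambiguous terms) a word from the parent context also appears.

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    # Hypothetical rule shape, for illustration only.
    concept: str                    # taxonomy path assigned as the tag
    terms: list[str]                # at least one must appear in the text
    required_metadata: dict[str, str] = field(default_factory=dict)
    context_terms: list[str] = field(default_factory=list)  # disambiguation hints

def tag(text: str, metadata: dict[str, str], rules: list[Rule]) -> list[str]:
    """Return the concepts whose rules match the text and its metadata."""
    lowered = text.lower()
    tags = []
    for rule in rules:
        # 1. The content must mention one of the concept's terms.
        if not any(t.lower() in lowered for t in rule.terms):
            continue
        # 2. Every required metadata value must be present and equal.
        if any(metadata.get(k) != v for k, v in rule.required_metadata.items()):
            continue
        # 3. If context terms are given, one must appear (hierarchical disambiguation).
        if rule.context_terms and not any(c.lower() in lowered for c in rule.context_terms):
            continue
        tags.append(rule.concept)
    return tags

RULES = [
    Rule("Finance/Bank", ["bank"], {"department": "Finance"}, ["loan", "deposit"]),
    Rule("Geography/River/Bank", ["bank"], {}, ["river", "shore"]),
]
```

With these sample rules, the word "bank" resolves to a different concept depending on the surrounding context and the document's metadata.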

Entity Extraction

AutoClassifier recognizes provided terms, phrases, and regular expressions within content and assigns them to metadata properties. You can extract, for example, part numbers, project names, or customer names from a document. This can also be used for detection of PII, GDPR compliance, or for similar compliance and content auditing applications.
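
As a rough illustration of the idea, the Python sketch below maps regular expressions to metadata properties. The property names, the part number format (PN- followed by digits), and the customer names are assumptions made up for this example; they are not AutoClassifier configuration.

```python
import re

# Hypothetical property-to-pattern mapping; formats are invented for illustration.
EXTRACTORS = {
    "PartNumber": re.compile(r"\bPN-\d{4,6}\b"),
    "Email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "Customer": re.compile(r"\b(?:Contoso|Fabrikam)\b"),
}

def extract_entities(text):
    """Return {property: sorted unique matches} for every pattern that hits."""
    result = {}
    for prop, pattern in EXTRACTORS.items():
        matches = sorted({m.group(0) for m in pattern.finditer(text)})
        if matches:
            result[prop] = matches
    return result

sample = "Contoso ordered PN-10422 and PN-2201; contact jane.doe@contoso.com."
entities = extract_entities(sample)
```

A pattern such as the Email one is also the kind of rule that could flag personal data for a compliance review, although real PII detection needs far more than a single expression.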

Smart Tagging with Scoring and Weighting

AutoClassifier uses a rules-based approach with a powerful yet familiar full-text query language, complete with Boolean, proximity, scoring, weighting, and fielded search capabilities. There are no ‘black box’ algorithms, so you can understand and control how content is enriched and classified. You have fine-grained control over the returned metadata and can enforce rules that weigh quality of hit against quantity of hit, ensuring that only the very best metadata for the document is returned. Rule generation is scriptable and starts with intelligent defaults, minimizing the effort needed to maintain rules. A Test Bench lets you preview categorization results in real time against your documents.
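
To make scoring and weighting concrete, here is a minimal Python sketch. It does not use AutoClassifier's query language; the function names, the weights, the min_score threshold, and the max_tags limit are hypothetical. Each category scores a document as the weighted count of its terms, and only categories that clear the threshold are returned, best first. For simplicity it handles single-word terms only.

```python
import re

def score_category(text, weighted_terms):
    """Sum weight * occurrences for each single-word term in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return sum(weight * words.count(term) for term, weight in weighted_terms.items())

def best_tags(text, categories, min_score=2.0, max_tags=2):
    """Keep categories scoring at least min_score, highest score first."""
    scored = [(score_category(text, terms), name) for name, terms in categories.items()]
    kept = [(s, n) for s, n in scored if s >= min_score]
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in kept[:max_tags]]

# Hypothetical categories and weights.
CATEGORIES = {
    "Contracts": {"agreement": 2.0, "party": 1.0},
    "Invoices": {"invoice": 2.0, "payment": 1.0},
    "HR": {"employee": 1.5},
}

doc = "This agreement binds each party. Payment is due per the agreement."
result = best_tags(doc, CATEGORIES)
```

Here "Invoices" is mentioned once through "payment" but falls below the threshold, so only the stronger "Contracts" tag is kept. Raising min_score or lowering max_tags favors quality of hit over quantity.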

DataSet Connectors, APIs and Pipelines

You can connect to external systems or processes and enrich content by adding relevant metadata and/or normalizing terms. You can further enhance these external connections through rules and steps, so that both simple and complex scenarios are supported. For example, people names can be matched to a master directory using a fuzzy match so that misspellings can be cleaned up and different name formats can be normalized. Custom dataset connectors can be built for any enrichment process, such as domain-specific processing to recognize chemical names, protein sequences, etc. These connections can be sequenced, with the output from one connection triggering further processing and analysis.
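
The name-matching example can be sketched in Python with the standard library's difflib module. This is an illustrative stand-in, not the matching algorithm AutoClassifier uses; the directory contents, the normalize and match_person helpers, and the 0.8 cutoff are assumptions.

```python
import difflib

# Hypothetical master directory of canonical names.
MASTER_DIRECTORY = ["Alice Johnson", "Robert Smith", "Maria Garcia"]

def normalize(name):
    """Lowercase, collapse whitespace, and turn 'Last, First' into 'first last'."""
    name = name.strip()
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    return " ".join(name.lower().split())

def match_person(raw_name, directory=MASTER_DIRECTORY, cutoff=0.8):
    """Return the canonical directory name closest to raw_name, or None."""
    lookup = {normalize(entry): entry for entry in directory}
    hits = difflib.get_close_matches(normalize(raw_name), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None
```

Calling match_person("Smith, Robert") handles a different name format, while match_person("Alice Jonson") tolerates a misspelling. A name with no close entry returns None, so downstream steps can decide whether to keep the raw value.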


AutoClassifier supports complex content and metadata gathering using familiar VBScript. You can solve the trickiest problems in the most demanding scenarios; modify content, metadata, and mappings in any way desired; and combine multiple metadata fields together using scripting.

Taxonomy Management

Create and modify multiple taxonomies or ontologies, with drag-and-drop simplicity for rearranging categories and editing category rules. Taxonomy information is stored and managed in a shareable format, while also being enhanced with new features to allow auto-tagging. You can import and export taxonomies from industry formats such as SKOS, RDF, CSV, and the SharePoint term store interchange format, so you have the flexibility to use other taxonomy and ontology tools in combination with the BAI Software Portfolio.
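
As a simple illustration of taxonomy interchange, the Python sketch below reads a CSV taxonomy and expands each term into its full hierarchical path. The column layout (id, parent_id, label) and the sample terms are assumptions for this example only; the CSV layout AutoClassifier actually imports or exports may differ.

```python
import csv
import io

# Hypothetical CSV layout: one row per term, parent_id empty for root terms.
SAMPLE_CSV = """id,parent_id,label
1,,Products
2,1,Hardware
3,1,Software
4,3,Search
"""

def load_taxonomy(csv_text):
    """Return the full path of every term, in file order."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    by_id = {row["id"]: row for row in rows}

    def path(row):
        # Walk up through the parents until a root term is reached.
        # Assumes the file contains no cycles and no missing parents.
        parts = [row["label"]]
        while row["parent_id"]:
            row = by_id[row["parent_id"]]
            parts.append(row["label"])
        return "/".join(reversed(parts))

    return [path(row) for row in rows]

paths = load_taxonomy(SAMPLE_CSV)
```

Paths such as "Products/Software/Search" make the hierarchy explicit, which is what allows the same label to be told apart when it appears under different parents.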

Plug-In Framework

AutoClassifier is designed from the ground up to support the processing of documents stored in many core enterprise systems and can be extended to store the created metadata within those systems. The currently available plug-ins support SharePoint On-premises, SharePoint Online, and ClarityNow. Additional plug-ins are being developed.
