NCSC Highlights Redaction Technology Advancements
Last month the National Center for State Courts (NCSC) coordinated an Automated Redaction Proof of Concept (PoC) with several vendors.
The PoC was made possible thanks to funding from the State Justice Institute (SJI). The purpose was to provide accuracy benchmarks to courts considering implementing automated redaction technology. NCSC plans to issue the results prior to CTC, publish them on its website, and present them to various target groups and at conferences.
The court access/public privacy workgroup identified a need to give courts more information as they consider making documents available online. Specifically, NCSC wanted to educate courts on how advancements in redaction technology over the last ten years have significantly reduced the amount of manual effort required of courts.
Participating vendors were asked to identify and redact roughly 21 different types of sensitive information.
These included commonly redacted fields like SSN, DLN, DOB, and financial account numbers, along with more advanced fields like names and contact information for juveniles, crime victims, jurors, confidential informants, and law enforcement officers. Vendors were given a preliminary set of documents to train their software to identify the desired fields, followed by a new set of documents to measure the results of automated technology alone. While we are still waiting for the final accuracy range for participating vendors from NCSC, our unofficial results showed that our software was nearly 100% accurate on the fields our customers most typically choose to redact, and more than 91% accurate overall across all 21 fields. These results were based on zero human verification.
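To make the difference between the simpler and the more advanced fields concrete, here is a minimal sketch of pattern-based redaction for two structured fields. It is not the method used by any PoC participant: the patterns, the redact function, and the sample sentence are all hypothetical, and real court documents contain far more formats than these two.

```python
import re

# Hypothetical patterns for two structured fields (assumed formats only).
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "DOB": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace each match with a labeled marker and count matches per field."""
    counts = {}
    for field, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[REDACTED {field}]", text)
        counts[field] = n
    return text, counts

# Made-up sample text, not taken from any court document.
sample = "Defendant John Doe, DOB 04/12/1985, SSN 123-45-6789."
redacted, counts = redact(sample)
print(redacted)
print(counts)
```

Note that the name in the sample is left untouched. Fixed patterns cannot tell whether a name belongs to a juvenile, a victim, or a juror, which is why those fields are the harder ones and why vendors were given documents to train on.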
Machine learning has significantly reduced the amount of verification required for courts to achieve a nearly 100% final capture rate. For one recent project we were able to reduce verification by more than 60% while still maintaining the required 99.5% accuracy. The technology does so by more accurately identifying which documents, or specific pages, are likely to require manual review by court staff. Results for the NCSC PoC could likely be improved with a larger training set.
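One way to picture this kind of targeted verification is a confidence threshold: only pages where the software is unsure about a detection go to court staff. The sketch below is illustrative only. The Detection class, the pages_needing_review function, the scores, and the 0.95 threshold are all assumptions, not details of our product or of the PoC.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    page: int
    field: str
    confidence: float  # hypothetical model score between 0.0 and 1.0

def pages_needing_review(detections, threshold=0.95):
    """Return sorted page numbers with at least one detection below the threshold."""
    return sorted({d.page for d in detections if d.confidence < threshold})

# Made-up detections for a three-page document.
detections = [
    Detection(page=1, field="SSN", confidence=0.99),
    Detection(page=2, field="juvenile name", confidence=0.72),
    Detection(page=3, field="DOB", confidence=0.98),
    Detection(page=3, field="victim contact", confidence=0.90),
]
review = pages_needing_review(detections)
print(review)  # [2, 3]
```

In this example staff would review two of the three pages instead of all of them. Where the threshold is set determines the trade-off between verification effort and the final capture rate.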
NCSC hopes the results will show courts that advancements in technology have significantly reduced the barriers to public access.
The most expensive part of any redaction project is often the people involved (if manual verification is required). We believe the results of the PoC will demonstrate that this should no longer be an obstacle.
Another comment the court access/public privacy workgroup heard was that automated redaction can still be cost prohibitive. However, if we can leverage our OCR capabilities to eliminate data entry, classify documents, or identify specific verbiage in documents and conditionally route them to certain individuals, it would increase the value proposition for the courts. Here is a link to a recent webinar highlighting ways a few agencies in the justice community are eliminating manual workflows to reduce data entry:
To see our technology for yourself or talk about a manual process you would like to automate in your court or office, please reach out.
If you are attending CTC in Salt Lake City, September 12th through 14th, please stop by booth #320 to talk in person.