I apologize up front. This is a post that only those who handle large volumes of high school or college transcripts will find interesting. But given the feedback I got the last time I talked about it, I think those people will find it well worth reading.
I have seen a product that makes me think that I may already have been proven wrong about AI and its ability to decipher college transcripts. This product demonstration and an in-depth talk with the technical team makes me think they may have overcome some of the challenges I brought up in a previous blog post. I’ve only seen five or six examples so far, and before saying the system works well, I’d want to put it through a rigorous test against thousands of transcripts with known values. But what I saw today was impressive and gave me hope.
As I wrote in a blog post at my previous employer, the two main problems we encountered during substantial research were: 1) reading the text properly and 2) understanding grading periods. I will briefly recap these issues here.
1) As I pointed out in that blog post, people assume that sending transcripts between schools is done electronically as data. This couldn’t be further from the truth. In the past few years, most ARE sent electronically, but they are sent as pictures, not as data. This would be like someone sending you a picture of this post instead of the post itself. Perfectly fine if you want to read it, but not so much if you want to cut and paste from it. To enter it into a computer system you would have to retype it.
To get around this issue, many systems use Optical Character Recognition (OCR), which is a way of turning the photo of text into actual data. Unfortunately, OCR is by no means perfect and is especially problematic if the image is low resolution or there are issues such as different color schemes. To make matters worse, for recognition and security reasons almost all schools’ transcripts have large watermarks that often run over text, which can make it difficult for humans to decipher and even harder for OCR.
OCR performance is typically around 98%, which is to say it gets 98 out of 100 characters correct. That may seem like a low error rate, but let’s run the numbers. The fake transcript snippet with this blog post has about 500 characters. That would mean it has 10 mistakes on it. Given that this snippet is about a third of a standard transcript, that works out to 30 mistakes per transcript. That is a high number. If I had 30 typos on this page, you’d sure notice!
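To make the arithmetic concrete, here is the back-of-the-envelope calculation as a tiny Python sketch; the 98% accuracy and the character counts are the rough figures above, not measured values:

```python
# Back-of-the-envelope estimate of OCR mistakes per transcript,
# assuming ~98% per-character accuracy (rough figure, not measured).
chars_in_snippet = 500    # approximate size of the sample snippet
accuracy = 0.98           # assumed OCR character accuracy
snippet_fraction = 1 / 3  # the snippet is ~1/3 of a full transcript

errors_in_snippet = round(chars_in_snippet * (1 - accuracy))
errors_per_transcript = round(errors_in_snippet / snippet_fraction)

print(errors_in_snippet, errors_per_transcript)  # 10 30
```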
Of course, how much of a problem a mistake causes depends on where it is and what the system gets wrong. For instance, it’s much less serious for the system to see “Span1sh” as “Spanish” than it is to mistake a “B” in a course for a “D”.
That said, I believe that these mistakes could be drastically reduced by some simple system double checks. A standard spell checker would handle the “Span1sh” mistake and, since most US transcripts list both a letter grade and a grade point alongside it, it’s a simple matter to flag the transcript for review if the grade and its associated grade point don’t match. In my experience, the pluses and minuses after grades are especially prone to mistakes, but matching them with grade points should allow them to be reliably flagged for human review.
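A minimal sketch of that cross-check, using a hypothetical 4.0-scale mapping (real scales vary by school, as discussed later in this post):

```python
# Hypothetical grade/grade-point consistency check. The mapping
# below is an assumed 4.0 scale for illustration only; actual
# schools publish their own scales.
GRADE_POINTS = {
    "A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7, "D+": 1.3, "D": 1.0, "F": 0.0,
}

def needs_review(letter_grade: str, grade_points: float,
                 tol: float = 0.05) -> bool:
    """Flag the row if the OCR'd grade and grade points disagree."""
    expected = GRADE_POINTS.get(letter_grade.strip().upper())
    if expected is None:
        return True  # unrecognized grade -> send to human review
    return abs(expected - grade_points) > tol

# A "B-" misread as "B" no longer matches its 2.7 grade points:
print(needs_review("B", 2.7))   # True -> flag for review
print(needs_review("B-", 2.7))  # False -> consistent
```

A dropped minus sign, the kind of mistake described above, is exactly what this catches: the letter looks plausible on its own, but the grade point gives it away.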
2) The stickier issue in my experience is the grading period, which is to say which year and semester the grade was earned in. Typically, grading periods are noted directly above the corresponding grades, but there can be variations. For instance, sometimes the periods are enclosed in boxes, or months might be used instead of semesters. Additionally, there can be a significant amount of text between the date and the grades. Transcripts vary widely; some feature two, three, or even four columns packed with grades, while others list just one course per line.
On top of that, there’s a LOT more than just grades on many transcripts. Transfer credits, grade point averages, honors, standardized test scores, degree information, not to mention the standard biographical information, are all on there.
The point is, figuring out what is or is not a grade is tough enough; associating them with their grading period is even tougher.
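To illustrate why the association is hard, here is a deliberately naive sketch that handles only the simplest possible case: a single-column layout where a “Fall 2021”-style line sits directly above its grades. The boxed periods, month-based terms, multi-column layouts, and intervening text described above would all defeat it:

```python
import re

# Naive grading-period association: scan top to bottom, treating
# each "Fall 2021"-style line as the period for the grade lines
# that follow it. Illustrative only -- real transcripts are far
# messier than this.
PERIOD = re.compile(r"^(Fall|Spring|Summer|Winter)\s+\d{4}$")
GRADE_LINE = re.compile(r"^(?P<course>.+?)\s+(?P<grade>[A-F][+-]?)\s*$")

def attach_periods(lines):
    period, out = None, []
    for line in (l.strip() for l in lines):
        if PERIOD.match(line):
            period = line
        else:
            m = GRADE_LINE.match(line)
            if m:
                out.append((period, m["course"], m["grade"]))
    return out

sample = ["Fall 2021", "English 9  A-", "Algebra I  B+",
          "Spring 2022", "Biology  A"]
print(attach_periods(sample))
```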
There might be hope!
I met for an hour with John Reese[1], CEO of MyDocs, and his two leading technical people. John reached out to me because he had read my previous blog post and wanted to know more about the problems we had encountered. Over that hour they walked me through what they’d done in detail[2] and showed me the results.
Their initial OCR results seemed to be about 98% correct, which means pages were littered with possible mistakes. That said, they had a good system set up with the original and the OCR’d results side by side so a human could easily review and fix them. It was clear that most mistakes were things like the system not being able to tell whether the final character in a course like “Calculus I” was a 1 or an I.
We talked about some ways to make the OCR better but, frankly, I was unconcerned by what I saw. They had a good review system for someone to double check anything flagged, and I’m confident that a few simple checks, such as the system knowing that it’s typically “Calculus I”, will dramatically reduce the amount of double checking required.
On the grading period front, they seemed to do an even better job. Of the six transcripts I reviewed, I only noticed two errors (although that same error might happen multiple times). This was a heck of a lot better than we’d been able to accomplish at my previous employer. I was impressed.
In addition, the technology people really knew their stuff. I am enough of an AI expert[3] to see that they knew what they were doing, even if they wouldn’t let me under the hood quite as far as I’d have preferred.
In the spirit of full transparency, I had never heard of MyDocs before last week, had never met John Reese before, and have no business or financial ties of any kind with MyDocs. Further, although the results in a one-hour tech walk-through were impressive, I am withholding full judgement until I see the results from at least a thousand transcripts the system has never seen before from random schools processed and then the resulting data compared to current methods. From what I’ve seen, I hold out hope that those results will be highly positive, but the proof of the pudding is in the eating and full judgement must wait.
One thing I think they will struggle with more than they realize is trying to standardize grading systems. Even in the US, it’s far more complex than most people realize, and overseas is an even bigger hurdle. For instance, in the US at some colleges an A+ is a 4.25, at some a 4.33, and at others a 4.0. Colleges usually publish the grading scales on the back of their transcripts.
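To see why this matters, consider how the same set of grades yields a different GPA under each scale. The school names and scales below are hypothetical, but the A+ values are the real variants mentioned above:

```python
# Hypothetical schools using the three real-world A+ conventions
# mentioned above (4.25, 4.33, and 4.0). Same grades, different GPA.
SCALES = {
    "school_x": {"A+": 4.33, "A": 4.0, "B": 3.0},
    "school_y": {"A+": 4.25, "A": 4.0, "B": 3.0},
    "school_z": {"A+": 4.0,  "A": 4.0, "B": 3.0},
}

def gpa(school: str, grades: list[str]) -> float:
    """Unweighted GPA under a given school's published scale."""
    scale = SCALES[school]
    return round(sum(scale[g] for g in grades) / len(grades), 2)

grades = ["A+", "A", "B"]
for school in SCALES:
    print(school, gpa(school, grades))
```

Any system that tries to put transcripts from different schools on a common footing has to know, for every school, which of these scales applies, and that information lives in fine print on the back of the transcript.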
Maybe one day AI will be able to handle grading scales as well, but that seems like an even tougher challenge!
1. To my great pleasure, I found out that John had been the head of Parchment, the transcript company, roughly two decades ago. Back then, Parchment had figured out that getting universities to upgrade their systems to allow sending transcripts as data was unlikely to happen anytime soon (and still isn’t!). They came up with the ingenious idea of having registrars send transcripts to a print driver which, instead of printing the transcript, took the image that would have been printed and sent it to Parchment along with the metadata needed to route it onward. This may be difficult to conceptualize, but I assure you that if not for that genius idea we would all still be sending most transcripts via the US Postal Service.
2. I admit that as a very hands-on person I was aching for even more detail, especially around which Generative AI models they used and how they used them.
3. My recent foray into a deep dive on creating RAGs for specific purposes has taught me that (1) I am quite knowledgeable about AI and (2) there is still a gigantic amount to learn and a lot of people way smarter than I am.
