CV as code: scripting to satisfy AI and (with any luck) human hirers

It was time to update my CV (resumé to North American readers) anyway, and I figured, why not just script the process? Over the many years of my professional life, my CV has existed as finely laid-out LocoScript, Microsoft Word, QuarkXPress, Adobe InDesign and ultimately Apple Pages documents, all used to create printed and, more recently, PDF copies of my Curriculum Vitae for distribution.

But the world has changed — boy has it changed — and now CVs need to appeal to AI scanners, not people. Register journalist Dominic Connor put me onto the case, noting that the old rules of one to two sides of A4 at most, and all the content summarised as tightly as possible to make it easy for tired hiring managers to glean what they need, have gone the way of the dinosaurs in these LLM-mediated times. So out goes a stylish CV typeset in a tasteful multi-column layout arranged to appeal to humans, and in comes one that’s plain but way better suited to online PDF parsers and bots.

The first step was one I’d been thinking about anyway: writing out my CV in Yaml form. Why Yaml? It’s a little more reader-friendly than Json, and it isn’t hard to convert to Json with Python’s built-in Json library and a third-party library to parse the Yaml. I used PyYaml. So that’s the two key machine-readable formats sorted out.
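As a rough illustration of that conversion, here is a minimal sketch. It assumes PyYAML is installed; the function name `yaml_to_json` is made up for this example and is not taken from the actual script.

```python
import json

import yaml  # PyYAML, the third-party parser mentioned above


def yaml_to_json(yaml_text: str) -> str:
    """Parse YAML text and re-serialise it as indented JSON (hypothetical helper)."""
    data = yaml.safe_load(yaml_text)  # safe_load builds plain dicts, lists and scalars
    return json.dumps(data, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    sample = "Name: Tony Smith\nReferences: Available on application\n"
    print(yaml_to_json(sample))
```

Using `safe_load` rather than `load` is a cautious choice here: a CV only needs plain mappings, lists and strings, so there is no reason to allow arbitrary object construction.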

Except, of course, I’ve yet to see a job application site which, while allowing you to upload your CV and have data extracted from it, will accept .json or .yml files. Microsoft Word .doc and .docx files, yes; PDFs too; even .txt files, but not the others.

So I wondered how I might morph the Yaml source, like that below, straight to PDF.

Name: Tony Smith
References: Available on application
Links:
  - LinkedIn: https://www.linkedin.com/in/smittytone/
    Website: https://smittytone.net
    GitHub: https://github.com/smittytone
    Blog: https://blog.smittytone.net
    Side Hustle: https://merch.smittytone.net/
Key IT Skills:
  - Programming:
      - Swift and Objective-C on macOS (appKit, SwiftUI), iOS/iPadOS (UIKit, SwiftUI), with Xcode GUI/CLI tooling.
      - Swift on linux.
      - C and C++ embedded with GCC, GDB and CMake.
      - Go for command-line tools on Linux and macOS.
      - TypeScript/JavaScript/Node.js for command-line tools on Linux and macOS.
      - Python programming on Linux and macOS; MicroPython and CircuitPython for embedded.

The obvious route is via Markdown, and it’s not hard to write some Python functions to do the job. PyYaml nicely converts the source material to a nested set of Python collection objects, and then it’s just a matter of iterating over them, converting keys into headlines and values into textual information or, in the case of nested collections, recursing through them until you reach the ultimate scalar values. This is what you get from the above:

# Curriculum Vitae — Tony Smith
## References
Available on application
## Links
* [LinkedIn](https://www.linkedin.com/in/smittytone/)
* [Website](https://smittytone.net)
* [GitHub](https://github.com/smittytone)
* [Blog](https://blog.smittytone.net)
* [Side Hustle](https://merch.smittytone.net/)
******
## Key IT Skills
### Programming
* Swift and Objective-C on macOS (appKit, SwiftUI), iOS/iPadOS (UIKit, SwiftUI), with Xcode GUI/CLI tooling.
* Swift on linux.
* C and C++ embedded with GCC, GDB and CMake.
* Go for command-line tools on Linux and macOS.
* TypeScript/JavaScript/Node.js for command-line tools on Linux and macOS.
* Python programming on Linux and macOS; MicroPython and CircuitPython for embedded.
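The recursion behind that output might look something like the sketch below. It is an illustration under stated assumptions, not the real script: the function name `to_markdown` and the rule that "a key whose value is a URL becomes a bulleted link" are guesses based on the sample, and the special cases visible in the sample (the title line built from the name, the spacer rule) are deliberately left out.

```python
def to_markdown(node, level=2):
    """Return a list of Markdown lines for a parsed Yaml node (hypothetical helper)."""
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(value, str) and value.startswith(("http://", "https://")):
                # Assumed rule: a key paired with a URL becomes a bulleted link
                lines.append(f"* [{key}]({value})")
            else:
                # Any other key becomes a heading, and its value sits one level deeper
                lines.append(f"{'#' * level} {key}")
                lines.extend(to_markdown(value, level + 1))
    elif isinstance(node, list):
        for item in node:
            if isinstance(item, (dict, list)):
                # Nested collections inside a list keep the current heading depth
                lines.extend(to_markdown(item, level))
            else:
                # Scalars in a list become bullet points
                lines.append(f"* {item}")
    else:
        # A bare scalar is emitted as plain text
        lines.append(str(node))
    return lines


if __name__ == "__main__":
    cv = {
        "References": "Available on application",
        "Links": [{"Website": "https://smittytone.net"}],
        "Key IT Skills": [{"Programming": ["Swift on linux."]}],
    }
    print("\n".join(to_markdown(cv)))
```

The function takes data that has already been parsed, so it works on whatever nested dicts and lists the Yaml parser hands back and can be tried out without any third-party library.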

My script isn’t entirely generic: it’s in the nature of the CV that nested elements in one section aren’t always going to be formatted exactly the same way as are those in another section, even assuming they have the same keys. So the script has to do some key checking to adapt its output accordingly. I also put in some very fine spacer rules to help with human readability, and you don’t want to drop them in ahead of every single key of a particular set of characters.

I wanted to have as little raw Markdown in the original data as possible, to keep the source data clean and avoid polluting the Json output but also because that’s the script’s job. In fact, I plan to extend it with arbitrary rules to handle this kind of thing.

[Image: Yaml to Markdown is pretty straightforward, but quirks may be necessary]

The next phase is Markdown to PDF conversion. Enter Markdown-pdf, a Python library that combines a Markdown-to-HTML library (Markdown-it-py) with one that converts HTML to PDF (PyMuPDF) behind a simple API. Bingo.

Markdown-pdf exposes some of the underlying libs’ functionality, so I could tweak the page margins, for example, and more importantly the fonts to be used. That’s done by passing in custom CSS. Now, I use SCSS to compose stylesheets during deployment of my website, so I also looked for a Python SCSS parser. I found libsass-python and imported it. So, load in an SCSS file, convert it to CSS and pass the result to Markdown-pdf.
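Those steps can be sketched roughly as follows. This assumes the `markdown-pdf` and `libsass` packages are installed; the helper names `load_css` and `markdown_to_pdf` are invented for illustration, and the `borders` values are simply what I understand to be Markdown-pdf’s documented defaults rather than the margins actually used.

```python
from pathlib import Path


def load_css(style_path: str) -> str:
    """Return CSS text, compiling the file first if it is SCSS (hypothetical helper)."""
    path = Path(style_path)
    if path.suffix == ".scss":
        # libsass-python; imported here so a plain .css file needs no extra install
        import sass
        return sass.compile(filename=str(path))
    return path.read_text(encoding="utf-8")


def markdown_to_pdf(markdown_text: str, css: str, pdf_path: str) -> None:
    """Render Markdown to a PDF file styled with the given CSS (hypothetical helper)."""
    from markdown_pdf import MarkdownPdf, Section

    pdf = MarkdownPdf()
    # borders are page margins in points; assumed here to match the library defaults
    section = Section(markdown_text, toc=False, borders=(36, 36, -36, -36))
    pdf.add_section(section, user_css=css)
    pdf.save(pdf_path)


if __name__ == "__main__":
    # Assumes a cv.scss file exists alongside the script
    markdown_to_pdf("# Curriculum Vitae\n\nAvailable on application\n", load_css("cv.scss"), "cv.pdf")
```

Accepting plain CSS as well as SCSS is an extra convenience in this sketch, not something the original script is known to do.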

This isn’t the only way to achieve this, of course. Another is to convert the original Yaml to TeX formatting and use something like Pandoc to handle the PDF output. To be fair, Pandoc can probably be used as the sole tool, and if I ever decide I want Word output too, I’ll likely go down that route. But where’s the fun in handing all the work over to an existing tool? I wanted to work on the Yaml-to-Markdown bit myself, which is why I went the way I did.

Now you can run the script, which operates as a command-line tool so you can specify which .yml and .scss files you want to use, and you’re away. Well, I was, the proof being the much better parsing of the generated PDF when I uploaded it to a handful of job application sites.
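For a sense of what such a command-line front end could look like, here is a small sketch using Python’s standard `argparse` module. The argument names (`source`, `--style`, `--output`) and their defaults are assumptions made for this example; the real tool’s options may well differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Describe the command line: one Yaml source plus optional style and output paths."""
    parser = argparse.ArgumentParser(description="Render a Yaml CV as a PDF.")
    parser.add_argument("source", help="path to the .yml file holding the CV data")
    parser.add_argument("--style", default="cv.scss", help="path to the .scss stylesheet")
    parser.add_argument("--output", default="cv.pdf", help="where to write the PDF")
    return parser


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    # Placeholder: the Yaml -> Markdown -> PDF conversion would be called here
    print(f"Would convert {args.source} using {args.style} into {args.output}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Saved under a hypothetical name such as `cv2pdf.py`, it would be run as `python cv2pdf.py cv.yml --style print.scss`.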

There’s no guarantee, of course, it’ll make any difference to my success rate — not high to begin with; call me Yosser… — but at least I can see the data’s being pulled out of it correctly, and it’s still readable on the off-chance a human being takes a look.

Hopefully, the AIs out there will give it the thumbs-up too. I could sure do with a break…

For those who remember, LocoScript, developed by Locomotive Software, was the word processing software bundled with the Amstrad PCW8256.

smittytone messes with micros

BULLETINS FROM THE TECHNOLOGY FRONT LINE — AND OCCASIONALLY BEHIND IT
