Tuesday, May 14, 2024

Project 24: Blog HTML Generator - Introduction

I’ve been having a lot of fun with my new book tracking infographic. Doing things like making a word cloud of authors and a grid of titles is just interesting enough that it’s a great job to have on hand when I don’t want to do anything that needs more thinking. And, more or less, I like the look of what I’ve put together. I’m sure I could make it more … infographicy … but for now I still like it, and it’s all been an easy amount of tracking.

What frustrates me (or at least leaves me squinting over a screen) is that I like the look of two columns of data for books and authors and other things, but it is hard to manually transcribe and update that information correctly. While I’m enjoying making some parts of the infographic by hand, there’s no reason not to automate the parts I don’t want to do by hand.

The data that I track for books isn’t terribly complicated, and my display isn’t all that complicated either. My first thought was that I could probably just build a table manually in Python. Then I started looking around and while Python has an HTML library that I could possibly use I also discovered Jinja, which is a template generator you can call from Python and which lets you build structured text with your own data added in.

I thought Jinja looked fun, possibly out of old-timey Mail Merge nostalgia, and I really enjoyed the bit where it just takes a template and adds your data where you put in a marker. It has some fairly functional concepts of conditionals and loops and doesn’t rely on much else. You can use it for a bunch of different things, and it’s probably overkill for me, but I’m a sucker for accessing data in double braces. {{ humour }}

Getting Set Up

Documentation for Jinja is a little spotty, but not too bad. The big thing to keep in mind is that you need to install Jinja2 using pip. (Also that you then have to import it as jinja2 in code.) I found the Geeks for Geeks tutorial pretty good to get started. I went through the tutorial a couple of times, and once I knocked all the rust off my Python knowledge I was able to pass lists of dictionaries to the Jinja renderer and get a basic version out.

On the note of Python rust, the thing that really got me was the connection of Python keyword arguments to Jinja’s variables, but once I remembered and got comfortable with method keywords (template.render(authors=authors)) things got better. (I guess by convention for Jinja I should be calling that context, but I’m very bad at using libraries conventionally.)
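A minimal sketch of that keyword-argument connection, using a made-up template and invented sample data rather than the real script. The only part taken from the post is the render(authors=authors) call; the field names "name" and "books" are assumptions.

```python
from jinja2 import Template

# Hypothetical template: the name "authors" used inside the template
# is what the keyword argument passed to render() has to match.
template = Template(
    "{% for author in authors %}{{ author.name }}: {{ author.books }}\n{% endfor %}"
)

# Sample data, invented for illustration: a list of dictionaries.
authors = [
    {"name": "Ann Author", "books": 3},
    {"name": "Bea Writer", "books": 1},
]

# The keyword on the left is the variable name the template sees;
# the value on the right is the ordinary Python object.
html = template.render(authors=authors)
print(html)
```

render() also accepts a single dictionary, so template.render({"authors": authors}) does the same job, which is closer to the "context" convention.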

But the left and right!

I’ve maybe given too many introductory programming questions in my life, but when I looked at trying to neatly organise data into two columns, my first thought was to check if we’re on an even or odd iteration of the loop.

Jinja does not do that.

There might be a way, but when I made a variable and counted with it, it turned out it was being reset at the beginning of every pass through the loop (or possibly just never being changed). That was fairly disheartening, until I stopped and actually read the weird little blurb in the Jinja Tips and Tricks page, which I’d skimmed past a dozen times.
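That resetting looks like Jinja’s scoping rules at work: a variable set inside a for loop does not carry over to the next iteration or out of the loop. A small sketch that reproduces it, with a made-up counter name and item list:

```python
from jinja2 import Template

# A counter set outside the loop and "incremented" inside it.
# Jinja scopes the inner set to a single iteration, so the change
# never reaches the next pass or the code after the loop.
template = Template(
    "{% set count = 0 %}"
    "{% for item in items %}"
    "{% set count = count + 1 %}{{ count }} "
    "{% endfor %}"
    "after: {{ count }}"
)

result = template.render(items=["a", "b", "c"])
print(result)  # prints "1 1 1 after: 0"
```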

The loop.cycle() method is pretty neat. When you call it in a loop, it returns the first of the arguments passed in on the first iteration. On the next iteration it returns the second argument, and so on until it runs out of arguments, at which point it wraps around to the first argument again.

The way to use this took me a little while to process, since in the example there they’re changing on each row, but the thing I needed to know is that you can call the method again with different arguments in the same iteration, and the argument in the same position will always be returned. So I was able to update my template to add either the row start or nothing, and then either nothing or the row ending.

This is how that ends up looking:

  {{ loop.cycle('<tr>','') }}<td>...</td>{{ loop.cycle('', '</tr>') }}

I also had to add a check at the end to add an empty table element if there was an odd number.

  {% if books|length % 2 == 1 %}<td></td></tr>{% endif %}
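Putting the two snippets above together, here is a small runnable sketch of the two-column layout. The table wrapper, the book titles, and the single-line output are assumptions for illustration; the two loop.cycle() calls and the odd-count check are the ones shown above.

```python
from jinja2 import Template

# Autoescaping is off by default for a plain Template, so the '<tr>'
# strings passed to loop.cycle() come through as real tags.
template = Template(
    "<table>"
    "{% for book in books %}"
    "{{ loop.cycle('<tr>', '') }}<td>{{ book }}</td>{{ loop.cycle('', '</tr>') }}"
    "{% endfor %}"
    # An odd number of books leaves the last row open, so pad and close it.
    "{% if books|length % 2 == 1 %}<td></td></tr>{% endif %}"
    "</table>"
)

# Invented sample titles: three books, so the padding cell is needed.
html = template.render(books=["Book A", "Book B", "Book C"])
print(html)
```

With three books this gives one full row and one row padded with an empty cell; with an even number of books the final check adds nothing.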

I also added a set of conditional tests to add bold <b> tags, which really catches the thing that I struggled most to do when I was building these tables manually.

Other cleanup

I finally broke down and thought I’d ask ChatGPT a programming question. Unsurprisingly, if you were to look at my search history (or listen to me grumble about files), I asked it how to open a CSV file in Python. The answer was about as good as Stack Overflow would have done for me. (That’s gotten more complicated since I asked, I guess.) The answer was wrong (or, more correctly, missing a very important detail), but good enough.

I did get to learn that when you open a csv file, you can include the field names in the arguments to csv.DictReader, which is very helpful.

reader = csv.DictReader(file, fieldnames=(
    'name',
    'books_read_month',
    'pages_read_month',
    'books_read_year',
    'pages_read_year',
))
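A self-contained sketch of what that call gives back, using an in-memory file and invented sample rows in place of the real CSV. Two things worth knowing: when fieldnames is supplied, DictReader treats the first line as data rather than as a header, and every value comes back as a string. For a real file, the csv documentation recommends opening it with newline=''.

```python
import csv
import io

# Sample rows invented for illustration; the column order matches the
# fieldnames tuple above. There is no header row in this data.
sample = io.StringIO(
    "Fiction,2,640,9,2810\n"
    "Non-fiction,1,300,4,1150\n"
)

fieldnames = (
    "name",
    "books_read_month",
    "pages_read_month",
    "books_read_year",
    "pages_read_year",
)

# Each row becomes a dictionary keyed by the supplied field names.
rows = list(csv.DictReader(sample, fieldnames=fieldnames))
print(rows[0]["name"], rows[0]["pages_read_month"])
```

A list of dictionaries like rows is the shape that can be handed straight to a Jinja template.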

At this point things bogged down. I did not expect Excel to ambush me with “I now show all data upside down” (like, no seriously, it shows all of the text upside down and the rows upside down), so that obscured some issues I was having with getting the data arranged properly. (I’m blaming Excel for reordering the CSV … see the upside down thing above.) But with a little manual work I’m now able to get data out of my tracking spreadsheet and into some template HTML files which I can then put into my blog posts.

The thing that made me happiest was that, as I was working, I realised how much duplicate code I was generating between the template and the CSV loader for most of the tables. The books and the authors are unique, but all of the other tables share a format, so first I was able to use the same template. Then I was able to tidy up more and just pass the file name to one method and process all three tables with the same method and template combo. That nicely reduced the size of the generator script and made me feel like a proper computer scientist.
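A rough sketch of that "one method, one template" shape. Everything named here is hypothetical: render_table, the shared template, the three file names, and the sample rows are stand-ins, and the temporary directory only exists so the example runs on its own.

```python
import csv
import tempfile
from pathlib import Path

from jinja2 import Template

FIELDNAMES = (
    "name",
    "books_read_month",
    "pages_read_month",
    "books_read_year",
    "pages_read_year",
)

# Hypothetical shared template for every table with this column layout.
TABLE_TEMPLATE = Template(
    "<table>"
    "{% for row in rows %}"
    "<tr><td>{{ row.name }}</td>"
    "<td>{{ row.books_read_month }}</td>"
    "<td>{{ row.books_read_year }}</td></tr>"
    "{% endfor %}"
    "</table>"
)


def render_table(csv_path):
    """Load one CSV by file name and render it with the shared template."""
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        rows = list(csv.DictReader(csv_file, fieldnames=FIELDNAMES))
    return TABLE_TEMPLATE.render(rows=rows)


# Stand-in files so the sketch is runnable; the names are made up.
with tempfile.TemporaryDirectory() as tmp:
    tables = {}
    for stem in ("genres", "formats", "sources"):
        path = Path(tmp) / f"{stem}.csv"
        path.write_text(f"{stem} example,1,100,5,500\n", encoding="utf-8")
        tables[stem] = render_table(path)

print(tables["genres"])
```

The point of the design is that adding another table in the same format only means adding another file name to the loop.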

Next Steps

My current “hand crafting” setup is that I copy select data out of my main reading spreadsheet and export it as five CSVs, then I run the HTML generator over those and get five files with the tables. I then pull those tables into the blog post in Blogger.

That really covers all of what I’m looking for, and the data is simple enough that I don’t really need anything else for books. The original idea for the project was to do stuff more or less by hand, and I’ve automated the parts where I was frustrating myself. I might come back and mess around more, but for now I’m pretty happy with the book infographic.

The game infographic is a little more of a mess and the new format I’m thinking about needs a lot more data displayed and folded in different ways. So I think I am going to go on and play with Jinja to generate a monthly games infographic. I’ve done a little bit already so hopefully I’ll be able to pop out the March and April infographics before the end of May.

It’s been a fun project, and it was nice to take on something of a reasonable size which I could finish in a reasonable amount of time. Jinja seems useful for a lot of other applications, so I’m glad I’ve at least had a chance to play around with it.

A Few Helpful Links
