It has been over a year since I started publishing the Thursday Doors Recap page on Sundays, but It’s only been about a year since the process of producing that page has been on autopilot.
Several many lots of OK, a few people have shown some interest in the behind-the-scenes story.
That’s your cue to skip to the gallery – it gets geeky beyond this point. Before you go, if you’re a Thursday Doors participant, the Recap Page continues to generate over 100 additional clicks, and people who include ‘tags’ continue to receive the most.
When I first got the idea to create the recap page, I built most of the processing in Excel. It was fast, easy to create, and I could see the results as I built/tweaked the formulas.
And no offense to GP Cox, but I don’t want any of those links in the recap. Fortunately, I can filter many of them out. I don’t want links containing “gravatar” and I don’t want any links containing “nofacilities.com.” Filtering out those two strings reduces 10 links to one – “https://pacificparatrooper.wordpress.com/” – I don’t want that one either, but extracting 100 links from my page was manageable, so I continued with that until I found a better way.
Here’s your chance to think like a programmer. I can feel you shudder.
“https://photographias.wordpress.com/” is a link to the parent page of Sofia Alves’s blog.
“https://photographias.wordpress.com/2022/04/21/thursday-doors-lisbon-7/” is the link to Shofia’s Thursday Doors post. How can I teach Excel to know which is which?
If you said “count the number of ‘/’ characters, you get 100 bonus points. Since you’re still here, I’ll explain. The formula below is how Excel counts the ‘/’ characters.
LEN(B19) is the length of Sofia’s blog URL
LEN(B19)-LEN(SUBSTITUTE(B19,”/”,””)) is the length of her URL with all the ‘/’ characters removed.
The difference is the number of slashes removed. If it’s greater than three, it’s a link I want to keep. So, I sort the block of links by the column containing the number of slashes and I delete all the rows where the answer is three.
If you’re thinking, “the recap page rarely has over 40 links…” Give yourself another 100 bonus points. You’re right. Some of those links are duplicates. Duplicates are caused by people who leave a comment on my blog and include a pingback on their blog. Don’t take this the wrong way – I love the pingbacks – you’re spreading the link to Thursday Doors by including those!
When I run my script against my blog page, I get a table of entries that look like the image below:
I paste this into Excel. Then I can work with it.
- Order in which the URL was linked to my blog (either by comment or pingback).
Now, back to those duplicates. I sort the data area by the URLs. Then I use Excel’s Cell Highlight feature to highlight the duplicates and then I delete one of the rows of data. Note: I keep the row with the tags, but I give it the lower of the two numbers representing order of arrival.
To get the tags to look nicer, I have a column with the formula below:
=IF(ISNUMBER(SEARCH(“#”,C14)),SUBSTITUTE(SUBSTITUTE(C14,”,”,” “),”#”,” “),” “)
In other words, if the Tags column includes a hash character, I replace the ‘#’ with blank spaces. I do the same for commas. That’s hard to see because the formula parameters are separated by commas.
Once I’ve manually eliminated the duplicates, I sort the remaining 30 or so data rows by the order in which they were posted. Now that the data is formatted and sorted, it’s time to create the table. Actually, since I’m using Excel, the table is constantly being formed and reformed.
If you’ve ever seen the Recap page, you know there are three columns: The URL to the blog post, the location of the blogger, and any tags they included. Here’s how I do that:
=HYPERLINK(G13,L13) – This formula creates a hyper link from the text in L13 and the URL in G13.
This function might not be familiar to many Excel users, but it’s one of the most power functions Excel offers. VLookup() and it’s companion HLookup() allow you to look for a string in a table and retrieve something associate with that string. In my case, I am looking for the URL of the blog’s parent page in a table of information provided by the form the bloggers filled out online. This gives me the friendly name of their blog (which I don’t get if they only leave a pingback) and the location they provided.
In layman’s terms, I look for the URL in the table. If I don’t find it (ISERROR), I leave the result blank. If I do find it, I take the result from the third column of that table. Then I copy the tags from the column where the ‘#,” was removed. The result table is shown below:
Next, I open WordPress, edit the Recap Page, delete the old table, copy the result table from Excel and paste it into the WP Editor. Then I set a couple parameters and I’m done.