To do (excerpt): redesign, catalogue, PIM-KB, Semantic EPUB

I’ve had some ideas for things to do at some point in the future, but I can’t do all of them at the same time and most of them need more thought (and research). I’ll list them here, in no particular order, hopefully to inspire more ideas and spark discussion.

Ultimate visit

In July, on my first day at the office, I was asked to present a paper at a conference workshop in Indianapolis. What a great way to start a job!

It was my first time in the United States, so I had to see something other than the conference hotel. With a couple of extra days before conference duties began, no NASCAR races during my stay and without a long list of things to see in Indy, I decided to see if I could join a game of Ultimate Frisbee. After all, the game originated in the USA and it’s played by fun people.

It turned out there are plenty of opportunities to play pick-up games. I ended up in a garden at Butler University, where people play regularly. It was hot, humid and great fun, even though the four-person team I was on lost all its games.

As expected, the players were fun locals. One of them agreed to give me a ride and to eat something American with me in a place I wouldn’t have discovered myself: nachos deluxe with a club sandwich on the side at the Old Point Tavern. I could tell him we speak Dutch in the Netherlands, and we discussed some of the cultural differences between America and the Netherlands. Being used to good-quality Dutch beer from only a couple of brands, I was glad to learn that in Indiana(polis), or maybe in all of the USA, good beer is brewed in small local breweries. Homebrewing, which is rare in the Netherlands, may happen more often in the US. In fact, someone at the pick-up game needed to get rid of some old stock of (mostly) Belgian homebrewed beers. Lucky me 🙂

(Some photos will go here, but I realised I didn’t ask for permission to publish the group photo we took. Coming soon, I’m sure.)

And to finish off this post, a coincidence: tonight I learned that just hours after I played in Indy, someone from Indy (well, Bloomington, “around the corner” from Indy) played in The Hague, and she actually knew the game at Butler University. I haven’t asked whether she has a sister whom I talked to on the bus to the game. Based on looks I’d say it wouldn’t be impossible, but how much coincidence can you handle?

A new form of Twitter spam?

For a couple of weeks now I have been attracting a new form of spambots. It took me a couple of minutes to realise that the accounts were probably not controlled by humans. Here’s what I get:
I receive an email saying “Aurora Santee (@Ritanbrj) favorited one of your Tweets!”. The tweet was in Dutch and nothing indicated that Aurora Santee could understand it.

It’s becoming a pattern:

  • the username (@Ritanbrj) doesn’t have anything in common with the real name (Aurora Santee)
  • the real names aren’t what I consider common Western names, but they are feminine
  • the account has a bio and it may include a URL (don’t click it, of course)
  • the few accounts that I looked at had about 16 tweets, most of them some sort of quotes; they also had about 50 following and 20-30 followers
  • all account activity before the favouriting of not-so-random tweets (the five bots favourited two tweets) happened on the same day

With that said, there are similar bots that favourite tweets and just advertise “buy Twitter followers” in their timelines. And there are actual people favouriting some of my tweets… 🙂

    “Kennis over publiceren” converted to EPUB

    At a panel discussion about publishing cultures in academia on the 18th of December 2012 (which unfortunately I didn’t attend), De Jonge Akademie published a little book on the topic [zotpressInText item=”2GMXVGV6″].

    Although the book’s paper size is almost the same as my Sony (PRS-T2) e-reader’s screen size, the PDF version isn’t really readable on the device. The letters are too small, even when most whitespace is removed. Because I wanted to read it, and preferably on my e-reader, I converted it to EPUB myself. Here are some observations about the process.

    My first attempt was fully manual: I opened the PDF in PDF-XChange Viewer and copied the text from the document to Sigil. This introduced anomalies, as markup (headings, line and paragraph breaks etc.) and formatting (e.g. italic text, superscripts) were lost. It’s a lot of work to restore, even for this PDF of just 86 pages. I quit while processing the second chapter.

    The second attempt still took some work, but the first step was already easier. Calibre was able to convert the PDF and create an EPUB file, preserving most of the markup and formatting and even the cover.
    There was a lot to tweak, though:

    • soft hyphens at the end of lines are not removed in the conversion process;
    • most of the uppercase letters were stored as (and hence copied as) lowercase, including chapter titles, quotes in ‘small caps’ and “de jonge akademie”;
    • text in footers ended up in the middle of the text (although this also happened when manually copying from the source document);
    • tables were torn apart (but this may have been an option in the conversion process that I should have turned off);
    • front and back cover were apparently stored as one image with the cutting marks in the PDF, and had to be cut out by hand, stored as separate JPEGs and linked to in the EPUB;
    • in some phrases that were in italics, each word had its own set of <i></i> tags;
    • I recreated the box around one paragraph in the introduction;
    • I added as much metadata as I could find in the original to the EPUB;
    • the interviews with members of De Jonge Akademie had no markup, just formatting – I made them ‘real’ chapters by putting the title in <h1></h1>;
    • I moved one of the interviews from the middle of a chapter to the end of the chapter, to not confuse the table of contents creator.
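
    Two of these tweaks (the leftover soft hyphens and the italics split word by word) lend themselves to scripting. What follows is a minimal sketch and not what I actually did: it assumes the soft hyphens are stored as the Unicode character U+00AD and that the italic tags are plain <i> elements without attributes. The function names are made up for this illustration.

```python
import re

# Assumption: the converter left real soft-hyphen characters (U+00AD) in the
# text. If it wrote "-" plus a line break instead, this will not catch them.
SOFT_HYPHEN = "\u00ad"

# A closing </i> directly followed (apart from whitespace) by an opening <i>.
# Only matches plain tags without attributes.
ADJACENT_ITALICS = re.compile(r"</i>(\s*)<i>")


def remove_soft_hyphens(text: str) -> str:
    """Drop every soft hyphen so hyphenated words are joined again."""
    return text.replace(SOFT_HYPHEN, "")


def merge_adjacent_italics(xhtml: str) -> str:
    """Collapse '<i>a</i> <i>b</i>' into '<i>a b</i>', keeping the whitespace."""
    return ADJACENT_ITALICS.sub(r"\1", xhtml)


def clean(xhtml: str) -> str:
    """Apply both clean-ups to one XHTML fragment."""
    return merge_adjacent_italics(remove_soft_hyphens(xhtml))


if __name__ == "__main__":
    sample = "<p><i>ken\u00adnis</i> <i>over</i> <i>publiceren</i></p>"
    print(clean(sample))
```

    A regular expression is good enough for these two simple patterns, but it is not an XML parser. Anything more involved (nested formatting, tags with attributes) is probably safer to fix by hand in Sigil.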

    There is probably more that could be done, but this seems enough for now. I welcome suggestions for improving the result and the process (though I probably won’t do this again soon).

    The resulting EPUB file can be downloaded. This derived work is available under the original licence (Creative Commons Attribution 3.0 NL), so you can (e.g.) improve it without asking.