Problem Restatement

TL;DWR: If you’re not a long-time tabletop gamer with a collection of PDFs, this probably isn’t for you.

I started in digital game formats pretty early. Not early-early. There were probably files up on Sunet before I could dress myself – but still, fairly early. I was trying to publish in the early days of most of the modern small-press options, so I tried a little bit of everything, particularly digital formats.
I tried Hummingbird, Digital Paper, and anything else I could find in the days before Acrobat added Zip compression. Hard to believe there was a toss-up in that race… I remember handing a demo floppy to a very confused Mike Pondsmith at one GenCon. Fun times.
This leads to a long-running battle I’ve had with a 20-plus-year-old collection of purchased, gifted, demo, or produced PDFs. 53 GB. Excluding straight-up fiction and comics. It’s a bit much. Trying to organize it is another nightmare entirely.

I gave up on Calibre and wrote my own tool, which works okay save for a damned memory leak on mass import. But this doesn’t solve the ultimate problem of “What is this file?” Trying to solve that, I landed back on a tool that was amazingly useful in the early days of digital music, when we were all ripping our CDs: CDDB. If you’re young enough not to remember this dark age (or the darker age after CDDB became a commercial entity), CDDB was an online database of CDs that MP3 rippers used to help you label, tag, and otherwise identify the tracks on a CD when you transferred them to another medium. It was a huge time saver and pretty accurate.

That’s what I’m looking at doing. Maybe I’m a hammer in search of nails here, I don’t know. It seems like a real problem.

Working Thoughts

Admittedly, this leans HARD into the way I’ve thought about this for my collection. I’m currently working in Go with a MariaDB backend because I enjoy using Go and this seems like a relational task that MariaDB can handle just fine.

Features will include:

  • Anonymous queries with partial metadata
  • Anonymous and authenticated submissions to update or append to the database
  • Fixed and freeform fields
  • Provide query URLs for online sellers for lookup evaluation
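
As a rough illustration of the "partial metadata" idea, here is a minimal Go sketch that builds a parameterized lookup from whichever fields the client supplies. The `books` table, its column names, and the `PartialQuery` type are assumptions made up for this example, not a settled schema; the `?` placeholders are the style typically used with MariaDB drivers.

```go
package main

import (
	"fmt"
	"strings"
)

// PartialQuery holds whatever metadata the client happens to know.
// Empty fields mean "unknown" and are left out of the search.
// The field names are illustrative assumptions.
type PartialQuery struct {
	Title     string
	Publisher string
	GameLine  string
	MD5       string
}

// BuildLookup turns a PartialQuery into a parameterized SQL statement
// against a hypothetical "books" table, returning the SQL text and its
// arguments in matching order.
func BuildLookup(q PartialQuery) (string, []any) {
	var where []string
	var args []any
	add := func(clause string, value any) {
		where = append(where, clause)
		args = append(args, value)
	}
	if q.MD5 != "" {
		add("md5 = ?", strings.ToLower(q.MD5))
	}
	if q.Title != "" {
		// Titles are matched loosely, since file names rarely agree exactly.
		add("title LIKE ?", "%"+q.Title+"%")
	}
	if q.Publisher != "" {
		add("publisher = ?", q.Publisher)
	}
	if q.GameLine != "" {
		add("game_line = ?", q.GameLine)
	}
	sql := "SELECT id, title FROM books"
	if len(where) > 0 {
		sql += " WHERE " + strings.Join(where, " AND ")
	}
	return sql, args
}

func main() {
	sql, args := BuildLookup(PartialQuery{Title: "Example", Publisher: "Example Press"})
	fmt.Println(sql)
	fmt.Println(args...)
}
```

Keeping the values in a separate argument list, rather than pasting them into the SQL string, matters more than usual here because the queries are anonymous.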

What is stored?

The Card Catalog stores four categories of facts about a book: Product Line, Production, Gameplay, and Instance. Some facts have values attached; others are binary, where a true/false value indicates whether a property exists.

Product Line

This covers the whats and withs of the product. Is it an RPG? Does it require any particular products to be usable? Is it for a specific edition? These include:

  • Game Line – What is this product meant to be used with? Multiple values are acceptable and should align with how the publisher promotes the product rather than customer inferences. For example, a World of Darkness product should not list Chill even if the material is easily reworked. Conversely, many OSR fantasy supplements list a particular implementation first (OSRIC, White Box) and then OSR Fantasy. Editions should be noted.
  • Game Systems – The game systems, if named, as specified by the publisher. With some products, this is a duplicate field. However, there are enough open/licensed systems (and those begetting systems) being used these days that a breakout should exist. Rules editions, where appropriate, will also be listed here. Examples: Cypher System, Hero System, D20, FATE, Mage Revised. Products intended for multiple unnamed systems should be listed as Multi-system. Those with multiple named systems should list each.
  • Requires for Use – Does this require any additional products to be played? The classic example is the D&D PHB.

Production

Production holds the facts that we typically see when we look up information about a book in any form, along with a few similar things that are relevant to the RPG ebook ecosystem. These include:

  • Publisher – The manufacturer of the product. Where more than one publisher has owned a product, all should be listed, provided we are looking at the same product under different owners. A new or revised title should have a distinct entry.
  • Sequential Title – For periodicals or series titles. This is used for the masthead title of magazines, newsletters, and anything else done as a series.
  • Title – The title of an individual product.
  • Author(s) – The credited writer(s). This should include ANY level of contribution listed on the credits, title page, and/or colophon.
  • Editor(s) – The credited editor(s). This should include ANY level of contribution listed on the credits, title page, and/or colophon.
  • Artist(s) – The credited artist(s). This should include ANY level of contribution listed on the credits, title page, and/or colophon.
  • Page Count – “Actual” page count, as opposed to highest numbered page. This is, quite literally, the number of pages as seen by the document viewer.
  • Page Format – Is this a screen-oriented PDF, or is it in one of the common paper formats?
  • File Format – PDF, MOBI, EPUB, etc.
  • Printer Friendly – Is this a low-ink, toner-friendly version of this book?
  • Black and White – Is this a Black and White edition?
  • Full Color – Is it a color edition?
  • Indexed – Does it have an index?
  • Thumbnails – Does the file have thumbnail navigation?
  • Interactive – Does it include interactive features, such as a fill-in character sheet?

Gameplay

Here is where we label what kind of game it is and where it falls in the gaming spectrum. This is also where the most subjectivity will come into play, and where I would expect the most debate.

  • Game Type(s) – RPG, Miniatures, Boardgame, CCG
  • Genre(s) – Is it a Superhero setting? Horror? Cyberpunk Fantasy?
  • Subgenre(s) – This is for narrower categories of the above. Lovecraftian Horror, Science Police Super Heroes, Arthurian Fantasy, etc.
  • GM’d – Does this game use a referee?
  • Player Agency – Does this game allow the player to alter events beyond conflict resolution? Values of None, Low, Medium, and High.
  • Crunchiness – How deep are the rules? Values of Lite, Moderate, Heavy, and Crunchy.
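
Fixed-value fields like Player Agency lend themselves to a small ordered type. The sketch below is one possible way to model that scale in Go; the type name, constant names, and `ParsePlayerAgency` helper are assumptions for illustration, and Crunchiness could follow the same pattern with its own labels.

```go
package main

import (
	"fmt"
	"strings"
)

// PlayerAgency is an ordered scale: None < Low < Medium < High.
type PlayerAgency int

const (
	AgencyNone PlayerAgency = iota
	AgencyLow
	AgencyMedium
	AgencyHigh
)

// agencyNames maps each value to its label, indexed by the constant.
var agencyNames = []string{"None", "Low", "Medium", "High"}

// String returns the label for a value, or "Unknown" if out of range.
func (a PlayerAgency) String() string {
	if a < 0 || int(a) >= len(agencyNames) {
		return "Unknown"
	}
	return agencyNames[a]
}

// ParsePlayerAgency accepts a label case-insensitively, ignoring
// surrounding whitespace, and rejects anything outside the fixed list.
func ParsePlayerAgency(s string) (PlayerAgency, error) {
	trimmed := strings.TrimSpace(s)
	for i, name := range agencyNames {
		if strings.EqualFold(trimmed, name) {
			return PlayerAgency(i), nil
		}
	}
	return AgencyNone, fmt.Errorf("unknown player agency %q", s)
}

func main() {
	a, err := ParsePlayerAgency("medium")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	// Because the scale is ordered, values can be compared directly.
	fmt.Println(a, a > AgencyLow)
}
```

Rejecting unknown labels at the edge keeps freeform text out of a field that is meant to be fixed, which should help when submissions are anonymous.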

Instance

Instance data is about a particular “publishing” of a file. One of the best parts of being in a digital environment is the publisher’s ability to update, enhance, and otherwise change a book after publishing with minimal effort. As such, there are very likely to be multiple versions of a book. Instance data helps prevent multiple records for what is essentially the same title. This includes:

  • Filesize
  • MD5
  • Published Date per the file
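
The filesize and MD5 can be gathered in a single pass over the file. This is a minimal sketch using Go's standard `crypto/md5` package; the `Instance` type and the function names are assumptions for illustration, and the published date is left out because it has to come from the document's own metadata.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// Instance holds the facts that identify one specific file.
type Instance struct {
	Filesize int64
	MD5      string // lowercase hex digest
}

// Fingerprint streams r through MD5, so large PDFs are never loaded
// into memory whole. io.Copy reports the byte count as it goes.
func Fingerprint(r io.Reader) (Instance, error) {
	h := md5.New()
	n, err := io.Copy(h, r)
	if err != nil {
		return Instance{}, err
	}
	return Instance{Filesize: n, MD5: hex.EncodeToString(h.Sum(nil))}, nil
}

// FingerprintBytes is a convenience wrapper for in-memory data.
func FingerprintBytes(data []byte) (Instance, error) {
	return Fingerprint(bytes.NewReader(data))
}

// FingerprintFile opens the file at path and fingerprints its contents.
func FingerprintFile(path string) (Instance, error) {
	f, err := os.Open(path)
	if err != nil {
		return Instance{}, err
	}
	defer f.Close()
	return Fingerprint(f)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: fingerprint FILE")
		return
	}
	inst, err := FingerprintFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "fingerprint failed:", err)
		return
	}
	fmt.Printf("%d bytes, md5 %s\n", inst.Filesize, inst.MD5)
}
```

MD5 is used here purely as an identifier for matching files, in the same spirit as a CDDB disc ID, not as a security measure.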

Initial “Book” data structure:

  • Name
  • File Date
  • Published Date
  • Publisher
  • Game Line
  • Author(s)
  • Artist(s)
  • Editor(s)
  • Top Level Genre Tag
  • Blended Genre Tag(s)
  • Summary
  • MD5
  • Product Code
  • Online seller product code

Extended Attributes:

  • Subgenre tags
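
The initial "Book" structure and its extended attributes might map onto a Go struct along these lines. This is a sketch only: the field types, the JSON key names, and the choice to carry multi-valued fields as string slices are all assumptions, and the helper functions exist just to show a round trip.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Book mirrors the initial field list. JSON key names are assumptions.
type Book struct {
	Name              string    `json:"name"`
	FileDate          time.Time `json:"file_date"`
	PublishedDate     time.Time `json:"published_date"`
	Publisher         string    `json:"publisher"`
	GameLine          string    `json:"game_line"`
	Authors           []string  `json:"authors"`
	Artists           []string  `json:"artists"`
	Editors           []string  `json:"editors"`
	TopGenre          string    `json:"top_genre"`
	BlendedGenres     []string  `json:"blended_genres"`
	Summary           string    `json:"summary"`
	MD5               string    `json:"md5"`
	ProductCode       string    `json:"product_code"`
	SellerProductCode string    `json:"seller_product_code"`

	// Extended attribute: omitted from the output when empty.
	Subgenres []string `json:"subgenres,omitempty"`
}

// EncodeBook renders a Book as indented JSON text.
func EncodeBook(b Book) (string, error) {
	out, err := json.MarshalIndent(b, "", "  ")
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// DecodeBook parses JSON text back into a Book.
func DecodeBook(s string) (Book, error) {
	var b Book
	err := json.Unmarshal([]byte(s), &b)
	return b, err
}

func main() {
	// All values below are placeholders, not real catalog data.
	b := Book{
		Name:          "Example Sourcebook",
		PublishedDate: time.Date(2004, time.March, 1, 0, 0, 0, 0, time.UTC),
		Publisher:     "Example Press",
		Authors:       []string{"A. Writer"},
		TopGenre:      "Horror",
		Subgenres:     []string{"Lovecraftian Horror"},
		MD5:           "d41d8cd98f00b204e9800998ecf8427e",
	}
	text, err := EncodeBook(b)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(text)
}
```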

Online Seller Lookup:

  • Formatted URL
  • Site/Seller Name
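
One way to read "Formatted URL" is as a template with a slot for the seller's product code. The sketch below assumes a `{code}` placeholder convention and uses a made-up seller and address; neither is taken from any real storefront.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// codePlaceholder marks where the product code goes in a URL format.
// The "{code}" convention is an assumption for this example.
const codePlaceholder = "{code}"

// Seller pairs a site name with its lookup URL format.
type Seller struct {
	Name      string
	URLFormat string
}

// LookupURL fills the seller's URL format with an escaped product code.
// It returns an error if the format has no placeholder to fill.
func (s Seller) LookupURL(productCode string) (string, error) {
	if !strings.Contains(s.URLFormat, codePlaceholder) {
		return "", fmt.Errorf("seller %q: URL format has no %s placeholder", s.Name, codePlaceholder)
	}
	// PathEscape keeps spaces and slashes in a code from breaking the path.
	escaped := url.PathEscape(productCode)
	return strings.ReplaceAll(s.URLFormat, codePlaceholder, escaped), nil
}

func main() {
	// Hypothetical seller; example.com is reserved for documentation.
	s := Seller{Name: "Example Store", URLFormat: "https://store.example.com/product/{code}"}
	link, err := s.LookupURL("12345")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(link)
}
```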



