Sticky Notes!

Cherry Blossoms with To-Do List, a montage by John Farrell

I’m struggling to get back into my motivated groove for Extended Stats since all the holiday stuff. Although my wife is away and hence is unable to distract me, the dog is not and she demands an inordinate amount of attention. I’m also doing a lot of cooking of things that I can’t make for my vegetarian wife. This evening’s nasi goreng was pretty good if I do say so myself. But not much coding got done.

Today’s project was the top-right sticky note, which says “Express & EB”. Express is node.js software for writing a web server, and EB is Elastic Beanstalk, an AWS technology for scaling web servers – essentially you tell it “Here’s my code, put it on a machine. If it gets busy start some more machines running the same code.” I use EB at work, but I’ve never used Express before.

I’ve been getting the feeling that AWS Lambda is a little expensive for some of the things I want to use it for.  In particular, the note below “Express & EB” which says “geek buddies”. The plan for that note is that if you’re logged in you’ll be able to tell the site which other users are in your game group / family / other peer group. And the plan for that involves an autocomplete field where you start typing a BGG user’s name, and I give you all the valid completions for what you’re typing.

That’s easy enough to code, but it usually involves one HTTP call for each character typed, and as I pay for each HTTP call to a Lambda, I’m not very keen on that. So the plan is to make an Express server which can handle very tiny calls like that for a pretty much flat rate of a couple of bucks a month. That should decrease my Lambda costs (which I should also write a post about).
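For the curious, the lookup itself might look something like the sketch below. It is only the matching logic; the Express server would call something like this once per keystroke. The function name, the sample user list and the limit of 10 are all made up for illustration and are not the site's real code.

```typescript
// Hypothetical sketch: names and data are illustrative only.

/** Return up to `limit` user names that start with `prefix`, ignoring case. */
function completeUserNames(
  allNames: readonly string[],
  prefix: string,
  limit = 10
): string[] {
  const wanted = prefix.trim().toLowerCase();
  if (wanted.length === 0) {
    return []; // nothing typed yet, so don't send back every user
  }
  return allNames
    .filter(name => name.toLowerCase().startsWith(wanted))
    .sort((a, b) => a.localeCompare(b))
    .slice(0, limit);
}

// Each keystroke sends the text typed so far and gets back the candidates.
const sampleNames = ["Friendless", "Ozjesting", "cyberkev63", "Gecko3D", "friendly_ghost"];
console.log(completeUserNames(sampleNames, "fr"));
```

Each call is tiny, which is why a flat-rate server suits it better than paying per request.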

However, today’s plan to run Express on Elastic Beanstalk foundered at the bit where I tell Elastic Beanstalk about the database. Omigod, AWS security is basically impenetrable twaddle. At least, it is to me. At work we tell the servers about the database a different way which I don’t want to use here, so I would like to configure the database in EB. I ended up with EB and the database both in virtual private clouds, but in different virtual private clouds. Is that good? I don’t know. It was like when you can’t find two socks the same colour. So in the end I got cranky and dumped the EBs and will try to do some reading so I can get half a clue for the next attempt.

In happier news, I did manage to fix a bug. BGG started sending me ratings for games people hadn’t rated as “N/A” instead of leaving them completely absent, so I was trying to store “N/A” as a number, which broke stuff. Thanks to a new user for pointing out to me that something was broken.
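The fix amounts to something like this sketch. The function name is hypothetical and the real code may well look different; the point is just that “N/A” and a missing value are both treated as "no rating" before anything is stored as a number.

```typescript
// Hypothetical sketch of treating "N/A" the same as an absent rating.

/** Convert BGG's rating text to a number, or undefined when the game is unrated. */
function parseRating(raw: string | undefined): number | undefined {
  if (raw === undefined) {
    return undefined; // rating completely absent
  }
  const text = raw.trim();
  if (text === "" || text === "N/A") {
    return undefined; // BGG's marker for "not rated"
  }
  const value = Number(text);
  // Anything else that isn't a sensible number is also treated as unrated.
  return Number.isFinite(value) ? value : undefined;
}

console.log(parseRating("7.5"), parseRating("N/A"));
```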

The next bug I found was that if you have many many plays in a month, for example 3750, BGG tells me to stop asking for so many files all the time. As the Lambda wants to get its job done quickly because I’m being billed for the time it runs for, I can’t just sit around and do nothing. So I need to come up with a plan for dealing with many pages of plays over several lambda invocations. Something nice-ish will spring to mind eventually, I suppose. That technique could possibly also be adapted for users with very large (BGG) collections, e.g. 100000 as I have seen. Where’s my thinking cap? I need my thinking cap.

 

Bugs Bugs Bugs!

Hello, I’m back! I spent two weeks gallivanting around the world with my wife (who was working), and then I spent one more weekend playing Ingress in Canberra. And now finally after all the unpacking is done and the clothes washing is finished, I can get down to some programming. I last saw my wife in Dubai and don’t expect her home for a while yet, so I hope for minimal distractions.

So while I was away the system was running happily by itself, by which I mean it was unattended and nobody much was looking at the site 🙁 so really all it was doing was costing me money. The last bill arrived while I was in Italy so I haven’t looked at it in detail yet, but I did investigate why the Lambdas were still costing me more money than I expected.

I found a new way of arranging the Lambda usage graphs that makes more sense (and that’s what the pretty picture above is). It’s pretty clear that blue was doing way too much work, so I investigated that. That’s the bit which takes the list of what games someone has in their collection and puts it into the database. One user had many versions of games in their collection, so I kept trying to put the same game into the database over and over. I’ve sort of fixed that (but what can I do if someone gives the same game two different ratings?) and that particular problem seems to be fixed. In a week or two I’ll know how that has affected the performance (and hence cost) of the system.
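As a rough illustration of the "sort of fixed" part, here is one way to collapse duplicate versions down to one entry per game before writing to the database. The types and field names are assumptions, and keeping the first rating seen is an arbitrary choice; it doesn't answer the two-different-ratings question, it just picks one.

```typescript
// Hypothetical sketch: collapse multiple versions of a game into one entry.

interface CollectionItem {
  gameId: number;
  name: string;
  rating?: number;
}

/**
 * Keep one entry per game ID. If duplicates disagree on the rating,
 * the first rating seen wins (an arbitrary choice for this sketch).
 */
function dedupeCollection(items: readonly CollectionItem[]): CollectionItem[] {
  const byGame = new Map<number, CollectionItem>();
  for (const item of items) {
    const existing = byGame.get(item.gameId);
    if (existing === undefined) {
      byGame.set(item.gameId, { ...item });
    } else if (existing.rating === undefined && item.rating !== undefined) {
      existing.rating = item.rating; // fill in a rating if the first copy had none
    }
  }
  return Array.from(byGame.values());
}

const deduped = dedupeCollection([
  { gameId: 1, name: "Carcassonne" },
  { gameId: 1, name: "Carcassonne", rating: 8 },
  { gameId: 2, name: "Agricola", rating: 9 },
]);
console.log(deduped.length);
```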

The next bug I addressed was one that was found by alert reader Jeroen, who pointed out that if someone recorded more than 100 plays in a month, I only used the first 100. I knew what that was, so I investigated and fixed it. Then came the hard bit – fixing all the data that was wrong. Even finding it took a while – essentially I had to find all the months for all the geeks where there were 100 plays recorded. There were 2750 of those, so I then had to mark those as needing to be reprocessed. That was quite difficult to figure out. Ideally I would have just said “reprocess all the months”, but I know that that takes more than a month to do, so I had to get exactly those ones. I seem to have achieved it, but after all that hard work, the database was a bit puffed out and needed a rest. In technical speak, I exceeded my CPU quota and the burst balance was very quickly heading to zero as well. So now that I’ve figured out all of the months to fix, the database is too tired to do it and I’ve told the downloader to rest. I’ll turn it back on in the morning.
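The "find exactly those months" step boils down to the logic in the sketch below, shown here in memory rather than as the database query it actually was. The record shape, the key format and the function name are assumptions for illustration. A month with exactly 100 recorded plays is suspicious because that is where the old code stopped reading.

```typescript
// Hypothetical sketch: find geek/month pairs that hit the 100-play cutoff.

interface PlayRecord {
  geek: string;
  date: string; // assumed to be "YYYY-MM-DD"
}

/** Return "geek|YYYY-MM" keys for months with exactly `pageSize` plays. */
function monthsNeedingReprocessing(
  plays: readonly PlayRecord[],
  pageSize = 100
): string[] {
  const counts = new Map<string, number>();
  for (const play of plays) {
    const key = `${play.geek}|${play.date.slice(0, 7)}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  const suspects: string[] = [];
  counts.forEach((count, key) => {
    if (count === pageSize) {
      suspects.push(key); // probably truncated at the first page
    }
  });
  return suspects;
}
```

Months with fewer than 100 plays were never truncated, so they can be left alone, which is what makes this cheaper than reprocessing everything.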

So one of the good things about my wife being away is that I can now put sticky notes of things I need to do all over the house. There’s a rather nice Japanese flower picture in the living room here which is now covered in notes saying things like “geek buddies” and “API keys”. Actual user-facing features haven’t even made it to the sticky notes yet. However I shall be extremely pleased if I can get these things done in a couple of weeks.

Semiology of Graphics

A couple of weeks ago I was researching some Vega stuff (remember, Vega is the charting package I use for the site) and I found a reference to some software called D3. D3 is software that’s used to transform raw data to data for visualisation, and Vega is heavily based upon it. I don’t know the exact boundaries between the two. So I was reading about D3, and I found mention of a book called “Semiology of Graphics”, which is apparently a seminal work in the area. So I went to Amazon and bought a copy.

It arrived on Thursday. It’s a massive heavy book, like the Edward Tufte books, and it’s rather more academic than I anticipated. Because y’know, a book about semiology was always gonna be light reading, wasn’t it? Even before I sat down to read it I got an idea for a graphic for the site, so I grabbed a blank note book and a pen and started reading Semiology of Graphics.

So far I have got past the bit where he defines the terms he’s going to be using. Oh goodness I hope I am past that bit. However I have about 7 pages of ideas so far. I hope I am getting to the good bits soon.

Certainly I’m thinking about the data that the site displays in a different way. Geeks and games are qualitative data – they are incomparable and unordered. However they can be classified – geeks from Australia vs those from Germany, games which won the SdJ vs those which use Dice Rolling.

On the other hand, ratings and numbers of plays are quantitative data – they can be sorted. The quantitative data are used to describe the relationships between the geeks and the games. There are also a few dates involved, which are quantitative data as well.

I guess the site is about illustrating the interplay between the quantitative and qualitative data, particularly with the quantitative data being used to place the games in 2D space. Or, as the book points out, in 2D space with a variety of techniques being used to illustrate further dimensions of the data.

The image is the new version of the Rating vs Months played graph. The shapes and colours mean the same thing, because of colour-blindness. The chart illustrates that the games I keep coming back to are family games and strategy games. The family games are sort of a light strategy genre, so it seems that’s what I’m like. The same graph for cyberkev63 (shown below) has much more prominent red triangles for party games, which he is well-known for.

So that’s where the Semiology of Graphics has got me so far.

This weekend I’m off on a trip to Europe (not for Essen, as far as I know), so I will be distracted for a couple of weeks. I wish life would not get in the way of the development of the site! I guess the original site evolved over a few years, I should not be so impatient that the all-new even-better site is taking some time.

The Geek Page

One of the things that people liked about an older version of the site, from about 2009 or so, was that all of their stats were on one page. There were a few peripheral pages, but mostly it was just one. I had problems with that, because with most of the stuff on that page being generated on the server before sending it back, people who owned and played a lot of games had pages that took minutes to load. I figured that people didn’t want to wait for that long, and probably just scrolled through pretty quickly anyway, so I split the page into 6 tabs, and put less stuff on each. That’s the way the old site is today.

For the new site, I want a different feel – less like statistics and more like a game. In particular, it seems to be sort of a Stefan Feld game, with lots of complicated bits that (hopefully one day) work well together. So I will be encouraging people to explore a lot, and of course I need to build the site so that there is some exploring to be done (instead of lots of dead-ends like there are at the moment).

Nevertheless, people need a place to bookmark, and that is what the geek page is. Now I know it looks kinda gaudy, but Gaudi was a really famous guy and Hundertwasser is not so bad either, so maybe there is hope for me. But seriously, once I define the basic style of the site I will take some advice on the colours.

The Geek Page

At the top there, we have the navigation bar. That’s so that a few singleton links are always available. I just had a thought that the Privacy button and maybe the Github button would be better in a footer. Then I could put other stuff in the navbar, and the navbar and the footer bar together could look a bit like a score track around the edge of the board.

Then there’s the page title and the Log In button. The Log In button stands out like a sore thumb there, and I don’t like it very much. I want it to expand into some sort of user identity. At the moment, if you’re logged in it turns into a Log Out button and has your user name underneath, and your user name is a link to your user page. Your user page is to a user what the geek page is to a geek, except that users aren’t very much yet so that page is really dull.

Then we get to the large coloured panels, each of which is a hyperlink to another page. Well, some of them are dead links which go through to the home page, but I’m working on that. So when I code up new features, they go behind one of those panels. The panels themselves contain a few statistics, some of which are not available elsewhere on the site. That’s your reward for having to put up with my colour scheme.

Below the panels is the News section. This is black and white as all good news is. I figured that this was a good way to tell users about new things, but also to give the impression that the site is continually improving. With luck, people will see the news and go find the new things I mention.

Finally there’s the Table of Contents. It’s a little out of date and it’s also a little bit rubbish. I intended that the hyperlinks take you to features on other pages, but as the other pages depend heavily on JavaScript, they take you to the other page and then the thing you’re going to hasn’t loaded yet because the JavaScript hasn’t run. I’m not sure how to get around that yet, hence the disrepair.

Finally, eagle-eyed users will eventually notice that some pages take URL parameters, e.g. “?geek=Friendless”, and some don’t. I’m not sure whether I’m being cunning with that or not. What happens is, if you put that parameter on the URL, it sets the geek for the page to whomever you said. And then if you go to another page without a geek parameter on the URL, it remembers who it was set to previously. This is somewhere between convenient and confusing, especially as if there’s no parameter in the URL it’s impossible to know what geek the page is for. That’s on my very long list of things I need to sort out at some point.
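The behaviour is roughly what this sketch does. Where the site actually remembers the geek isn't stated here, so the storage interface and the key name below are assumptions, and an in-memory store stands in for whatever the browser provides.

```typescript
// Hypothetical sketch of "use the URL parameter, else remember the last one".

interface GeekStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

/** In-memory stand-in so the sketch runs outside a browser. */
class MemoryGeekStore implements GeekStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    return this.items.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
}

const GEEK_KEY = "geek"; // assumed storage key

/** Work out which geek a page is for, remembering it for later pages. */
function resolveGeek(queryString: string, store: GeekStore): string | undefined {
  const fromUrl = new URLSearchParams(queryString).get("geek");
  if (fromUrl) {
    store.setItem(GEEK_KEY, fromUrl); // remember for pages without the parameter
    return fromUrl;
  }
  return store.getItem(GEEK_KEY) ?? undefined;
}

const store = new MemoryGeekStore();
console.log(resolveGeek("?geek=Friendless", store)); // set from the URL
console.log(resolveGeek("", store)); // remembered from the previous page
```

The confusing case is the second call: the page knows who the geek is, but nothing in the URL says so.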

Why Yes, There Is Method to My Madness

Just this evening I released an update to the site. I added a new panel to the geek page. In case you don’t know the geek page, it’s something like:

https://extstats.drfriendless.com/geek.html?geek=Ozjesting

It’s the central page related to a particular geek. At the moment there are 6 panels, of which you can click on 3 to get through to other pages. On those 3 pages there are various tables and graphs and so on.

The question you are no doubt asking is “how do tables and graphs get allocated to those pages? Does DrFriendless just pick up the nearest page and jam some more stuff on it?”

That’s how I used to do it on the old site, pretty much, but I’m a grown-up now. These days I’m thinking about performance, so I group features (that’s what the old site calls tables and graphs) by the data that they use. For example, the Owned Games page uses games data (name, BGG average, etc) and geek game data (your rating for the game, whether you own it), for only the games that you own. When the page loads, it goes and retrieves exactly that data once and gives it to all of the features, which then display using the same data set.
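In code terms, the idea is something like the sketch below: one retrieval per page, with the result handed to every feature. The types, field names and the two example features are invented for illustration, and the loader is a plain function here so the sketch stays runnable; the real one would be a network call.

```typescript
// Hypothetical sketch: load the page's data once and share it with all features.

interface OwnedGameRow {
  name: string;
  bggAverage: number;
  rating?: number;
}

/** A feature is anything that can display itself from the shared rows. */
type Feature = (rows: readonly OwnedGameRow[]) => string;

function renderPage(
  loadData: () => OwnedGameRow[],
  features: readonly Feature[]
): string[] {
  const rows = loadData(); // the only data retrieval for the whole page
  return features.map(feature => feature(rows));
}

const gameCount: Feature = rows => `You own ${rows.length} games`;
const ratedCount: Feature = rows =>
  `You have rated ${rows.filter(r => r.rating !== undefined).length} of them`;

console.log(
  renderPage(
    () => [
      { name: "Carcassonne", bggAverage: 7.4, rating: 8 },
      { name: "Agricola", bggAverage: 7.9 },
    ],
    [gameCount, ratedCount]
  )
);
```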

If you understand selectors, then you’ll understand that a selector can be used to choose which games to return, so of course I do. The ultimate goal is to be able to change the selector on the page, so that if you want to display games you own but not books or expansions, you’ll be able to do so. I’ve made some progress on that plan (on the favourites page) but it is not widespread yet.
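If you don't know selectors, the gist is that a selector is a rule for choosing games, and rules can be combined. The sketch below models that with plain predicate functions. It is not the site's real selector syntax, which isn't shown in this post; every name in it is made up.

```typescript
// Hypothetical sketch: selectors modelled as composable predicates.

interface GeekGame {
  name: string;
  owned: boolean;
  isExpansion: boolean;
  isBook: boolean;
}

type Selector = (game: GeekGame) => boolean;

const owned: Selector = g => g.owned;
const expansions: Selector = g => g.isExpansion;
const books: Selector = g => g.isBook;

/** Combine selectors: a game must satisfy all of them. */
const and = (...parts: Selector[]): Selector => g => parts.every(p => p(g));
/** Invert a selector. */
const not = (inner: Selector): Selector => g => !inner(g);

// "Games you own, but not books or expansions."
const ownedBaseGames = and(owned, not(expansions), not(books));

const collection: GeekGame[] = [
  { name: "Carcassonne", owned: true, isExpansion: false, isBook: false },
  { name: "Carcassonne: Inns & Cathedrals", owned: true, isExpansion: true, isBook: false },
  { name: "Agricola", owned: false, isExpansion: false, isBook: false },
];
console.log(collection.filter(ownedBaseGames).map(g => g.name));
```

Changing the selector on a page then just means swapping which rule is used to fetch the games.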

The other factor in the data query is what data to return for each game, and that is fixed for the page. The pie chart for your ratings of games you own will never display plays – there’s just nowhere for them to go – so that data is unnecessary and should not be retrieved. So you don’t get to mess with what data is available for each game.

I guess it’s possible to display the owned games page using games that you previously owned, but that might be weird. On the other hand it might be genius, we’ll see what people come up with.

Now, there is even more method. Given that the page does just one query for data, I can cache that. Browsers these days have a thing called browser local storage where the web page can store stuff for later. So I could save the data for the page in browser local storage and just get it back from there later. This means (a) you wouldn’t have to wait to get the data and it wouldn’t come off your bandwidth allowance, and (b) it might be a bit out of date. So it would be best if that feature was used when you were on your phone and not so much on your desktop. When I get to putting that in I’ll make it configurable.
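Here is a minimal sketch of that caching plan, assuming the data is saved as JSON with a timestamp. None of it is implemented on the site yet, so the function name, the entry shape and the maximum-age setting are all assumptions. The store interface matches the `getItem`/`setItem` methods of the browser's `localStorage`, with an in-memory version so the sketch runs anywhere.

```typescript
// Hypothetical sketch: cache a page's single data query in a key/value store.

interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

class MemoryStringStore implements StringStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    return this.items.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
}

interface CachedEntry<T> {
  savedAt: number; // milliseconds since the epoch
  data: T;
}

/** Return cached data if it is fresh enough, otherwise fetch and store it. */
function loadWithCache<T>(
  key: string,
  store: StringStore,
  fetchFresh: () => T,
  maxAgeMs: number,
  now: number = Date.now()
): { data: T; fromCache: boolean } {
  const raw = store.getItem(key);
  if (raw !== null) {
    const entry = JSON.parse(raw) as CachedEntry<T>;
    if (now - entry.savedAt <= maxAgeMs) {
      return { data: entry.data, fromCache: true }; // no waiting, no bandwidth
    }
  }
  const data = fetchFresh();
  const entry: CachedEntry<T> = { savedAt: now, data };
  store.setItem(key, JSON.stringify(entry));
  return { data, fromCache: false };
}
```

The `maxAgeMs` knob is where the "configurable" part would go: generous on a phone, short or zero on a desktop.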

In other news, there is so much other news! I have more blog posts to write, but I also have more code to write and that’s more important. Stay tuned.

Welcome to New Users!

Well, I think it’s about time. In January I suspended addition of new users to Extended Stats due to capacity problems. I then procrastinated for a few months because I wasn’t sure how to solve some technical problems and didn’t know how to proceed. Then in June I solved the problems and progressed onto a proof-of-concept phase.

Development activity on Extended Stats over time (from Github)

However Extended Stats is not just about programming, it’s also about management of the site and management of the users on it. By “management of users”, I don’t mean “herding”, I really mean communicating with them, setting expectations, and participating in a conversation with them about the site. And for a few months or maybe years, I have been really bad at that. And I will probably remain really bad at it, that’s just what I’m like.

So one of the tasks I have to start doing now is onboarding new users to the new site. New users don’t get to be added to the old site as it’s overloaded, for which I hope they will forgive me. However I’m hoping that the new site will satisfy some of their expectations at first, and many more over time. There’s a post here about why the new site and the old site are so radically different:

Why Rewrite?

Now I guess new users will want to know WHERE’S MY PRETTY PICTURES? SHOW ME THE PICTURES! That’s a problem I’m still working on, but a good starting point is:
https://extstats.drfriendless.com/geek.html?geek=Friendless

but of course you should change “Friendless” to your boardgamegeek user name. That page will remain an index of all of the things related to you. In particular if you go through to Favourites or Collection there are some nice bits.

Now, if I still have your attention, I have a bit of a chat about privacy in this post:

Watch Out, There Might Be COOKIES.

I just had a look at auth0 (which manages logins for me), and the site now has 7 users with accounts, which I think is pretty impressive given that logging in achieves nothing so far :-). By the way, auth0 does not let me access those users’ passwords – it always worries me when I log in to a hobby site like this that I’m giving my password to some unknown nerd who might try to use it on other sites. Don’t worry about that, I cannot see your passwords as far as I know. I like it like that. Here’s more chatter about users.

It would be foolish of me to not mention to a dedicated reader such as yourself that the site is not free for me to run. I blog about the costs every month when the bill arrives. If you want to help, find the Patreon button on the site.

So, what happens next? Well there are a whole bunch of features that I need to port to the new site, so I’m going to be busy. I’m going to be distracted by a few other things until about November, so it’s probably going to be slow. One idea I had this morning was about how to prioritise what I should implement next. Of course, it involves writing code, so that in itself is a feature that needs to be implemented.

Thank you for using Extended Stats! I hope that the site can grow to meet your expectations, and I hope that you enjoy the ride!

Before and After

I mentioned a couple of weeks ago that I needed to do a big survey of all of the features in the old site, and make a plan for getting them onto the new site. And that I  had a big block of drawing paper specifically for such situations. Well, that bit’s true, but when it came to the crunch, I was sitting in bed with my laptop and couldn’t be bothered getting the drawing paper, so I had a(nother) look at Trello, and discovered that Trello is a great tool for organising a bunch of pictures into groups. So now I have a Trello board with the work that needs to be done to get the new site being as useful as the old site.

Trello board

I’ve been doing a lot of under the water work on the Favourites page, and spent the morning extracting bits of that code in the hope that I could use them again for other things. It was pretty cool from a programmer point of view, but useless from a user point of view. But then I decided that I would take the good bits that I’d extracted from the Favourites page and apply them to the User Collection page, which is the first table I did. I sort of wondered where that page was going, but it was supposed to correspond to one of the columns on the Trello board.

About then was when I realised that I’d deleted the user collection source code in a fit of cleaning one day. It’s on Github, I could have got it back, but when I consulted Trello to find out what the table on that page was supposed to look like, there wasn’t one. So I didn’t bother. Technically that page could hold a table which included differences between your rating and BGG averages, but that’s on BGG as Aldie stole the idea from me many years ago. Maybe I’ll make that table one day, but this is not that day.

There were three things in that column, but I only had time to get one done, so I chose a pretty one. It was the ratings by published year graph which looks like this on the old site:

You may remember from previous posts that I’m using a charting package called Vega. Vega is lovely and horrible at the same time.  It’s very powerful and the charts are very beautiful, but Vega does not play very well with Angular, so it’s always a bit of a pain to get it working. However after a couple of hours I got it sorted.

If you click through that image you will get to a live demo on the test site. (I usually link to the test site from the blog, as the live site gets changes about 24 hours later due to the CDN.) As I write this, there’s still more work to be done on that chart (e.g. tooltips, configurable start year), but it’s about as good as the old one. So I’m faintly pleased with that work.

In other news, the Favourites table got The Most Advanced Feature Yet! As well as having the Documentation button, and the Charts button, it got a Configuration button. The configuration button lets you change the selector that the table uses to load games. And of course if you do that then do charts, they are based on the new data that you loaded. This is very exciting to me, and maybe one other person.

The plan is that logged-in users will be able to store their preferred selector for each page. I’m not so far off achieving that, despite the login button being a bit broken at the moment, and not much functionality having been achieved on other parts of the site. But, it’s a brave new world that has such features in it.

Considering User Experience

My wife is an information science academic, meaning that she spends her time thinking about how people interact with information. For example, what’s in your mind when you’re doing a Google search. Or when the doctor tells you you have to stop smoking or you’ll die, what bit of that don’t you understand. She talks about things like “affordances”, which means sort of “a thing that’s available for you to do things”, e.g. a button or a door handle.

So user experience is not an alien concept to me, but you wouldn’t know that given how crappy the site has been so far. I’ve been doing that thing like a duck does, with a whole lot of furious activity under the water, while above the water the site has been serenely useless. Today’s job was to address that.

During the week I got the geek page working. Actually, no, that was this morning. It’s been a long day. The geek page is like this:

https://extstats.drfriendless.com/geek.html?geek=Friendless

It’s a page all about one BGG user. It has links to various bits of the site, some of which I’ve even done and work properly. So if you can get to the geek page, you can find your stats.

Then today I coded up an idea I had last night for the front page of the site (extstats.drfriendless.com). It’s like a FAQ with stats. I think geeks will get it, and they’ll play with it, and be able to find the rest of the site. If you have ideas for FAQs, please suggest them to me.

Also, calloo callay, I got the blasted login button to go to the right hand side of the page.

Oh What A Tangled Web We Weave

I’ve spent the day working on cleaning up a few things. When I started this project I didn’t know what could succeed, and I tried a few different ways of doing things. Time has shown me better ways, so I’ve gone back and revised a lot of things.

I’ve extracted two JavaScript libraries – one for the various data structures that I send around between the server and the client, and one for common user interface components (in particular, the charts). Extracting the charts out should make it easy to stick them into other views. I also need to extract the documentation component, and push some more stuff into the table component so that they can be reused as well. The tabbed view in the documentation has me a bit confused at the moment. I don’t really understand the template stuff they’re using, so I can’t fiddle with it.

The new-look navigation bar

I also updated the navigation bar to be more garish, and kinda like how it has turned out. It’s starting to give the site that look of game pieces all over the table that I’m going for.

I’ve also been working on a page that represents a geek. The more I think about the old site with the geek page with a variety of tables and diagrams on it, the more I think that’s not suitable for this site. On that site you scroll through a lot of stuff, I want this one to be more about poking around and discovering things.

The way I see it at the moment, there will be a relatively small number of different pages, but each of them will be quite complex. I need to sit down with the old site and plan out what those pages will be. Then from the geek page shown below, there will be a number of coloured squares that link to those pages. The squares will contain a little bit of data of their own, but I was too tired today to put it in there so they’re mostly blank at the moment.

You might also notice the login button hanging in there causing trouble still. Between the CDN caching stuff, Auth0 demanding https, and the insolent damn thing not wanting to go to the right-hand side of the page, it’s wearing me down. However at least now it’s a nice colour. Work will continue on that.

But anyway, this thought process led to the conclusion that the user collection page, e.g. http://test.drfriendless.com/collection.html?geek=Gecko3D, which is not very useful, is not very useful because I’m not clear where it’s going. And that’s practically the last step before it going somewhere.

In my method of work, ideas come together when I make a Big Diagram. I even have a big drawing book for such events. I think I’ve arrived at a point where I know what the next Big Diagram has to have in it. No really, this is a great thing!

Let There Be Pretty Pictures

I went to a MongoDB meetup a couple of weeks ago. I don’t use MongoDB, but I do like the meetup and I do think I need to know about NoSQL databases. There was a talk about MongoDB Charts, which looks like a pretty whizzy product if you use MongoDB. I was particularly impressed by the pretty charts, so I had a chat to the speaker afterwards, and he told me that they use Vega, and that I could too. So I put that on my list of things to do.

Now having spent way too many days writing code for the site that was not very cool, I decided to spend today trying to shoehorn Vega into the site. It looks like it worked!

BGG Rating vs Friendless’s Rating

If you want to try it yourself, the place where it definitely works at the moment is http://test.drfriendless.com/favourites.html?geek=Friendless. test.drfriendless.com is like extstats.drfriendless.com but without the https – the content distribution network which gives me the s in https also caches my pages all over the world, and it doesn’t like it when I update stuff all the time. So when I’m programming I send the changes to test.drfriendless.com.

At the top of the table is the Documentation button, where you can find out what the table data is about. So next to that is the Charts button, which reveals more buttons to display charts. When you use one of those, I give the data from the table to Vega and tell it to draw a chart. Because I’m using the data from the table, there’s no network call related to any of the charts at all, and it’s really quick.
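To give a flavour of what "give the data to Vega" means, here is a sketch that builds a Vega scatter plot specification with the table rows embedded directly in it. The row type, field names, sizes and the schema version are assumptions, and this is not the site's actual chart spec. The resulting object would then be handed to the Vega runtime to draw.

```typescript
// Hypothetical sketch: build a Vega spec from rows already in the table.

interface FavouriteRow {
  name: string;
  bggRating: number;
  myRating: number;
}

/** Scatter plot of BGG rating (x) against the geek's own rating (y). */
function ratingScatterSpec(rows: readonly FavouriteRow[]) {
  return {
    $schema: "https://vega.github.io/schema/vega/v5.json", // assumed version
    width: 400,
    height: 400,
    // The rows are inlined, so drawing the chart needs no network call.
    data: [{ name: "table", values: rows }],
    scales: [
      { name: "x", type: "linear", domain: [0, 10], range: "width" },
      { name: "y", type: "linear", domain: [0, 10], range: "height" },
    ],
    axes: [
      { orient: "bottom", scale: "x", title: "BGG rating" },
      { orient: "left", scale: "y", title: "Geek's rating" },
    ],
    marks: [
      {
        type: "symbol",
        from: { data: "table" },
        encode: {
          enter: {
            x: { scale: "x", field: "bggRating" },
            y: { scale: "y", field: "myRating" },
          },
        },
      },
    ],
  };
}

const spec = ratingScatterSpec([
  { name: "Carcassonne", bggRating: 7.4, myRating: 8 },
  { name: "Agricola", bggRating: 7.9, myRating: 9 },
]);
console.log(JSON.stringify(spec, null, 2));
```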

So far there are just 3 charts, because as I was approaching getting them working I realised that I really don’t know anything about Vega. Vega is very clever and very complex, and I haven’t read any of the doco yet. I think I’m off to a good start though!

Of course once I’ve modified the tables so I can hook in the selectors, you’ll be able to control the range of the data. And when I figure out how to work Vega, there should be a huge number of charts to choose from.

I’m very excited. I’d better go read that doco.