What Use Are Users?

I spent most of yesterday working on login, AGAIN. I think that’s the fourth full day I’ve put into that subsystem. The exception was the few minutes I spent sticking the navbar onto all of the pages. It’s ugly, but it’s a fairly fundamental component, as it helps users discover what pages there are on the site.

One of the entries in the navbar is “Data Protection”. This is not a very interesting topic, but thanks to the European Union’s General Data Protection Regulation, it’s something I probably need to deal with at some point. Having contemplated retrofitting GDPR compliance to the website at work (which is unnecessary because we only conduct business in Australia), I decided that the best way to deal with GDPR here would be head-on, up front.

This is why the only thing on the User page (https://extstats.drfriendless.com/user.html) is a JSON dump of the data I store about a user. I was starting work on configuring user options, but I got bored and will probably have to spend another day on that. The sorts of things I would add first (there’s a rough sketch of what that might look like after the list) would be:

  • screen display name as other users see it
  • whether you consent to have your personal data shared with advertisers
  • what your BGG username is
  • in fact, what your BGG usernames ARE.
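
To make that a bit more concrete, here’s a rough sketch of what I imagine the settings record might look like. None of these field names exist yet; it’s purely illustrative:

```typescript
// Hypothetical shape of the user settings record – nothing here is final.
interface UserSettings {
  // opaque identity token from the login provider, never shown to other users
  identity: string;
  // the name other users will see
  displayName: string;
  // GDPR-relevant consent flag
  shareWithAdvertisers: boolean;
  // one user may manage several BGG accounts
  bggUsernames: string[];
}

// A settings record for an invented user might look like this:
const example: UserSettings = {
  identity: "auth0|123456",
  displayName: "Friendless",
  shareWithAdvertisers: false,
  bggUsernames: ["Friendless", "Scrabblette"]
};
```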

That list suggests a few things about the future of the site. First of all, advertising. Yes, there will be some, unless I make sufficient money to fund the site some other way. In fact I tried to turn ads on already, but Google refused for unexplained reasons. One day I’ll try again.

Secondly, other users seeing your screen display name. I have to admit I don’t know exactly where that’s going, but maybe one day I’ll add the ability to play games online. I’m very interested in coding games, but I guess I had better get the site going first.

Oh yeah, and your BGG user names. Some people manage multiple BGG accounts – their own, their spouse’s, and their games group’s. So I would like to make switching between those accounts easy. Maybe I will also add geekbuddies, so that you can easily compare yourself to your buddies, or treat them as a group, or something. I did once develop an app to help decide what you and a bunch of friends might like to play, and I would love to bring that onto the new site.

The missing page in all of this is the Geek page. The Geek page will be the central place for the site’s data about a particular BGG user. Naively, I expect it to be similar to the pages in the old site where a bunch of tables and graphs are presented. I don’t think that’s practical though, for several reasons.

The web technology that I’m using to write the interactive parts of the pages, Angular, is very clever. However the one thing it can’t do is play nice with other Angular applications on the same page. Even on a page like the front page or the user page, the application on the page and the login button interfere with each other. I haven’t seen any particular errors related to that, but I don’t want to take any risks. So I need to figure out some solution to that – maybe isolating them from each other in IFRAMEs, or packaging them as Angular Elements (which don’t work very well yet).
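
For what it’s worth, the Angular Elements route would look roughly like the sketch below: each mini-application gets wrapped as a custom element, so a page can host several of them without them sharing an Angular context. LoginButtonComponent is a stand-in name, not code from the site.

```typescript
import { Injector, NgModule } from '@angular/core';
import { createCustomElement } from '@angular/elements';
import { BrowserModule } from '@angular/platform-browser';
import { LoginButtonComponent } from './login-button.component';

@NgModule({
  imports: [BrowserModule],
  declarations: [LoginButtonComponent],
  entryComponents: [LoginButtonComponent]
})
export class LoginElementModule {
  constructor(private injector: Injector) {}

  // Instead of bootstrapping a whole Angular application, register the
  // component as a custom element any page can use as <login-button>.
  ngDoBootstrap() {
    const element = createCustomElement(LoginButtonComponent, { injector: this.injector });
    customElements.define('login-button', element);
  }
}
```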

Another argument against pages full of features like the old site has is that that’s a pretty archaic way to do it. All of the graphs there are generated on the server side, and the PNG data is sent to the client. As JavaScript can do some very cool stuff these days, the modern way is to use the data that was loaded from the server and draw the graphs in the browser. In fact, see my next post (which I haven’t written yet) for more details.

This suggests that the new site will be substantially different from the old site. With selectors being more shareable, pages being much more suited to interactivity, and fancy things like graphs being easier to do, I am guessing it will be a lot more interactive and encourage exploration. What sort of experience that adds up to remains to be seen.

Finally, as I write this, the downloader is struggling to complete the download of the plays data. By struggling, I mean there are 16 files left to process out of about 300000, and those seem to be broken. So I’ll go fix that and then write another very exciting blog post.

The Bill for July

I got the AWS bill for July, and it’s not quite as amusing as the one for June was. It’s not going to destroy the project, but it does show that I need to keep an eye on costs.

The Extended Stats AWS Bill for July

Going through the significant line items, CloudWatch is one that I didn’t really want to see there. CloudWatch is cool – it’s where a lot of the graphs that I post come from. I had 4 dashboards showing me graphs, and it seems I only get 3 for free, and the 4th cost me $2.87. As I didn’t use it anyway, I trashed it.

The other component of CloudWatch is a charge for data sent to the lambda logs. It’s not for storing that data, it’s just for getting it there. I was warned weeks ago that that was going to be a problem so I cut down the amount of logging I was doing. Still, there has been a lot of activity this month due to the plays download, so there was always a bit of logging from that. Let’s see how it goes next month.

The Data Transfer costs are small so far, but I’m not sure whether they’re for me downloading data or for me sending out requested data. If it’s the latter, the site is in trouble when I start advertising it to 3000 users. Let’s watch that one.

The Elastic Compute Cloud costs are for the blog. I realised that cost was coming so during the month I decreased the size of the virtual machine that the blog was hosted on, and that cost should be less next month.

The costs for Lambda are for way exceeding the free quota of time and compute power. I suspect that is also related to the plays download.

The Relational Database Service costs are for the database, and that one will probably go higher next month. As mentioned in previous posts I had to upgrade the database during the month so that it could cope with the plays download load, and it’s still not doing that well. The good news is that for about double the cost I think I can start using AWS Aurora, which allegedly has very good performance. So when I organise a revenue stream for the site that might be resolved. Until then it will be coming out of my pocket.

As mentioned last month, it’s not too bad, but the costs definitely remain one of my priorities. When I start asking for Patreon contributions and putting ads on the site, you’ll know why. It’s not because I’m trying to profit from the site, but because something of this scale costs money under any circumstances.


On Our Selection

Female character from "On Our Selection"
She had wonderful faith in the selection, had Mother.

One of the more complicated features of the old site is a thing called selectors. They were an idea I thought of long after the initial code was written, but they were so handy to me that I retrofitted them a bit.

Essentially a selector is a rule that describes a set of games. As there are many sets of games involved in the site, having an easy way to describe them is useful. In the old site, it was mostly the case that the rules for choosing games for a feature were hard-coded, meaning that if I hadn’t chosen the right set of games for every interesting case, the site continued to suck.

However having had that idea once, I am too smart to let it go this time. So for the Favourites and Collection pages (the only ones that are working at the moment) I’ve encoded the rule about what games are displayed as a selector. When I get around to working on the table widget, I’ll make the selector editable, so then you’ll be able to choose your own set of games for each widget.
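
To give a flavour of the idea, a selector is just an expression that evaluates to a set of games. The sketch below is a made-up representation, not the real syntax (that’s documented on the selectors test page mentioned further down):

```typescript
// A made-up selector representation – the real thing may well differ.
type Selector =
  | { kind: 'owned'; geek: string }          // games a geek owns
  | { kind: 'played'; geek: string }         // games a geek has played
  | { kind: 'and'; operands: Selector[] }    // intersection of sets
  | { kind: 'or'; operands: Selector[] };    // union of sets

// Example: games that Friendless owns and has also played.
const favouritesCandidates: Selector = {
  kind: 'and',
  operands: [
    { kind: 'owned', geek: 'Friendless' },
    { kind: 'played', geek: 'Friendless' }
  ]
};
```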

Another thing that has changed since I wrote the first site is a technology called Ajax, or XMLHttpRequest. Those of you who are old may remember a time when you loaded a web page and you sat there and looked at it and it sat there and looked at you, and you were both stolidly passive. And then some pages started doing this thing where you would do a thing and they would change. Well, the thing that enabled that was Ajax, which allowed the web page to load new data from the server after the page was loaded. That technology only entered the standards process about when I started designing the site, so I didn’t know about it and didn’t use it.

These days though, it’s very easy to do. It’s built into the technology I’m using to develop the widgets on the web pages (Angular, by the way), so there will be a whole lot of it going on. Which means that generally the widgets on the pages will be a whole lot more interactive and fun to fiddle with. And selectors are one of the things you’ll be able to fiddle with.
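
In Angular the Ajax part is close to a one-liner with HttpClient. The endpoint below is made up, but this is the shape of what the widgets will be doing all over the place:

```typescript
import { HttpClient } from '@angular/common/http';
import { Injectable } from '@angular/core';
import { Observable } from 'rxjs';

@Injectable()
export class GamesService {
  constructor(private http: HttpClient) {}

  // Load a set of games after the page has loaded – this is the Ajax part.
  // The URL is a placeholder, not a real Extended Stats endpoint.
  loadGames(selector: string): Observable<object[]> {
    return this.http.get<object[]>('/api/games', { params: { q: selector } });
  }
}
```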

There’s a test page for selectors, which has the documentation and a test bed so you can try selectors out. It’s supposed to be here:

https://extstats.drfriendless.com/selectors.html

At the time of writing I’m waiting for the CDN to find that page and make it available. As I add more selectors I’ll add them to that page, because that’s the page I’ll be using to see whether they’re working.

One more thing before I go. I went to a talk at MongoDB the other night, and they are using this library called Vega for their charting tool. I think I might have a use for it myself.

Vega Visualization Grammar screenshots
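
I haven’t actually tried it yet, but embedding a Vega-Lite chart in the browser looks roughly like this – the element id and the data are invented for the example:

```typescript
import vegaEmbed from 'vega-embed';

// A minimal Vega-Lite spec: plays per month as a bar chart.
// The data here is made up – the real numbers would come from the server.
const spec: any = {
  $schema: 'https://vega.github.io/schema/vega-lite/v2.json',
  data: { values: [
    { month: '2018-06', plays: 42 },
    { month: '2018-07', plays: 57 }
  ] },
  mark: 'bar',
  encoding: {
    x: { field: 'month', type: 'ordinal' },
    y: { field: 'plays', type: 'quantitative' }
  }
};

// Render the chart into a page element with id "plays-chart".
vegaEmbed('#plays-chart', spec);
```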

Once More Into the Breach!

Cry ‘Havoc,’ and let slip the dogs of war;
That this foul deed shall smell above the earth
With carrion men, groaning for burial.

It really has been trench warfare this week. My last post was about the database crash. That took a day to get better by itself, but after discussing the matter with the AWS people on reddit, I decided that this was definitive proof that the database was too small, so I dumped it and got a bigger one. Which is a shame, because I think I already paid $79 for that small one.

Database Statistics
The database hovers between life and death for a week

Anyway, the bigger one is still not very big, but if I recall correctly it will cost about $20 / month. When I get some funding for the project I’ll probably upgrade again, but for the moment I’m struggling along with this one.

The graph above shows CPU used in orange. It’s good when that’s high; it means I’m doing stuff. The blue and green lines are the ones that broke the database during the crash, and they must not be allowed to touch the bottom. Notice in particular that when the blue line hit the bottom, it stayed there for most of the day and the site was broken. So let’s not do that.

So in response to this problem, I made some changes so that I can control how much work the downloader is doing from the AWS console. In the graph, if the orange line goes down and the green line goes up, that’s because I turned off the downloader; later on I turn it back on again. The initial download of games is about half done, so I expect another week or two of this!
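
The mechanics aren’t important, but to give the idea: the downloader reads its rate limit from somewhere editable in the console rather than having it baked into the code. A Systems Manager parameter would do the job; here’s a sketch with an invented parameter name (I’m not promising this is exactly what the real downloader does):

```typescript
import { SSM } from 'aws-sdk';

const ssm = new SSM();

// Read how many downloads per minute the downloader is allowed to do.
// '/extstats/downloader/rate' is an invented parameter name for illustration.
async function downloadsPerMinute(): Promise<number> {
  const result = await ssm.getParameter({ Name: '/extstats/downloader/rate' }).promise();
  return parseInt(result.Parameter!.Value!, 10);
}
```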

The Favourites Table

On the other hand, the good news is that there are plays in the database, so I started using them. My project yesterday was the favourites table, for which I had to write a few methods to retrieve plays data. That bit is working just fine, and the indexes I have on the plays make it very fast.

The table comes with documentation which explains what the harder columns mean, and the column headers have tooltips. There are other things about the table, like the pagination, which still annoy me, but I’m still thinking about what I want there. Some sort of mega-cool table with bunches of features which is used in all the different table features on the site…

That was a major advance, so I decided today to follow up with some trench warfare, and had another shot at authentication. This is so that you can log in to the site IF YOU WANT TO. I went back to trying to use Auth0, which has approximately the world’s most useless documentation. When I implement a security system I want to know:

  • where do the secrets go?
  • how can I trust them?
  • what do I have to do?

Auth0 insists on telling you to type some stuff in and it will all work. It doesn’t say where to type stuff in, or what working means, or what I have to do. I know security is complicated, but that doesn’t mean you shouldn’t even try to explain it; it means you have to be very clear. It’s so frustrating.

Authentication dialog
You can sign in but why would you?

But anyway, after a lot of failures I got this thing called Auth0.Lock “working”, in the sense that when you click Login it comes up, you can type in a username and password, and then it’s happy. I get told some stuff in the web page about who you are.

The remaining problems with this are:

  • when the web page tells the server “I logged in as this person”, how do I know the web page isn’t lying? Never trust stuff coming to the server from a web page.
  • there are pieces of information that the client can tell the server, and then the server can ask Auth0 “is this legit?” (there’s a sketch of that check after this list)… but I am not yet getting those pieces of information.
  • I have to change all of the login apparatus in the web page once you’ve logged in, to say that you’re now logged in and you could log out. But that’s not really confusing, that’s just work.
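
For the second point, the usual answer (as far as I can piece together) is that the client sends the server a signed JWT, and the server checks the signature against the keys Auth0 publishes for the tenant. A sketch, with a made-up Auth0 domain:

```typescript
import * as jwt from 'jsonwebtoken';
import jwksClient from 'jwks-rsa';

// Auth0 publishes its signing keys at a well-known URL for the tenant.
// 'extstats.auth0.com' is a placeholder domain, not necessarily the real one.
const client = jwksClient({ jwksUri: 'https://extstats.auth0.com/.well-known/jwks.json' });

// Look up the public key matching the key id in the token header.
function getKey(header: any, callback: any): void {
  client.getSigningKey(header.kid, (err: any, key: any) => {
    callback(err, key && (key.publicKey || key.rsaPublicKey));
  });
}

// Verify the token the browser sent. If this succeeds, the "I logged in as
// this person" claim really did come via Auth0, not a lying web page.
function verifyLogin(token: string): Promise<any> {
  return new Promise((resolve, reject) => {
    jwt.verify(token, getKey, { algorithms: ['RS256'] }, (err, decoded) =>
      err ? reject(err) : resolve(decoded));
  });
}
```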

One of the changes I had to make to get this going was to change extstats.drfriendless.com from http to https. That should have been a quick operation, as I did the same for www.drfriendless.com, but I screwed it up and it took over an hour. Https is better for everybody, but the thing that adds the ‘s’ is a CDN (content delivery network) which caches my pages, so whenever I make a change to extstats.drfriendless.com I need to invalidate the caches and then wait for them to repopulate. And that’s a pain.

Nevertheless, I’m pretty optimistic that Auth0 will start playing more nicely with me now that I’m past the first 20 hurdles. Once I get that going, I’ll be able to associate with your login identity stuff like what features you want to see. And then I will really have to implement some more features that are worth seeing.

She Cannae Take Any More, Cap’n!

So, a couple of hours after I wrote the blog post last night saying how everything was going full steam ahead, it all blew up. This morning, many bits of the system which were working just fine are failing. This points to the database, which is at the heart of everything, and all indications are that it broke at about midnight.

Graphs of database failure

I had a poke around, and eventually found the BurstBalance metric. In the top right graph, it’s the orange one that dives into the ground and bounces up.

What seems to happen is that if you overuse your database (in particular, the database’s disk), you eat into your burst credits, i.e. the burst balance. And at midnight I ran out of burst balance, so the database stopped responding.

Well, that’s something I learned today. At least now I know to watch this metric when the system is under proper load. It’s also a good indication of when it’s time to fork out for a bigger database.
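
I haven’t set this up yet, but the obvious way to watch it would be a CloudWatch alarm on the BurstBalance metric – something like the sketch below, with a made-up instance identifier and thresholds (in practice you’d also attach an SNS action so it tells somebody):

```typescript
import { CloudWatch } from 'aws-sdk';

const cloudwatch = new CloudWatch();

// Alarm when the RDS burst balance drops below 20% for three 5-minute periods.
// 'extstats-db' is a placeholder instance identifier.
async function createBurstBalanceAlarm(): Promise<void> {
  await cloudwatch.putMetricAlarm({
    AlarmName: 'extstats-db-burst-balance-low',
    Namespace: 'AWS/RDS',
    MetricName: 'BurstBalance',
    Dimensions: [{ Name: 'DBInstanceIdentifier', Value: 'extstats-db' }],
    Statistic: 'Average',
    Period: 300,
    EvaluationPeriods: 3,
    Threshold: 20,
    ComparisonOperator: 'LessThanThreshold'
  }).promise();
}
```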

Full Steam Ahead!

I’ve been beavering away on plays downloads. There were a couple of bugs so the downloader was stuck for most of the week, but this afternoon I got it working properly (this time for sure), so I cranked up the pace. I told the system to do 100 downloads per minute. It had 231900 to do, so it’ll still take a while – maybe 4 days, unless there are more bugs.

Anyway, that’s to complete that job. As some plays have already been downloaded, I’ve started populating the SdJ column in the War Table, and Total Plays in the Rankings table.

To populate the SdJ column I needed to know what the series were, so I coded up the bit that downloads the series metadata as well. I’ve dumped the old Catan / Carcassonne / Command & Colours, etc series, as they were getting silly and nobody cares. And if somebody does care I can put them back.

When the plays data is ready, I’ll get the War Table and the Rankings Table working properly, and then I’ll be in a position to implement some of the other features of the old system. I have a plan in mind for constructing pages of features, which I will experiment with a bit when I have some features to do it with.

I hope it gets more fun after this. I mean, it’s kinda fun for me to see it all coming together, but it’ll be better when it’s fun for you guys too. Here’s a pretty picture:

Graph of Lambda invocations over time showing sharp increase
Lambda invocations take a salmon leap

Watch Out, There Might Be COOKIES.

I was watching ABC TV this morning, and some commentators from “Download This Show” were talking about digital privacy. They mentioned that cookies were used to track you all across the internet and invade your privacy blah blah blah. As I work closely with digital marketing people I have to know a bit about that sort of thing, and I’m not scared of it, but I figure I should tell you guys a bit about potential privacy things.

First of all, cookies. The old site has cookies. I use them to store information you want me to know about your preferences, e.g. screen size and what features you want on your custom page. Whenever you come to the site, your browser sends me the cookies and I look in them to see what to do. I don’t know your identity, though it is extremely likely that if you’ve put your BGG user name in the cookie, you’re that same person. But you could put my BGG user name in there if you like. That’s about all cookies are good for.

However, the privacy stuff gets a bit more interesting once you combine it with “pixels” and tracking. A pixel used to be exactly what it says: the page includes a one-pixel image which is really too tiny to be seen. However, that pixel is loaded from Facebook. So when your web browser goes to load that pixel from Facebook, it gets told “Facebook user John Farrell requests this pixel because he’s looking at extstats.drfriendless.com” or whatever. So then Facebook knows what you’re looking at, because the person who made the website (me) put the Facebook pixel on the page.

This is very very common on the internet. You’d be appalled. I have a browser plugin called WASP.inspector which tells me how many pixels get installed by a page. news.com.au just downloaded 295 of them in the page. Practically every site you go to has such stuff on it.

Now what the Facebook pixel does is report back to Facebook that you went to this page. It doesn’t tell me who you are; that information stays with Facebook. But it does mean that if someone goes to Facebook advertising and says “I want to sell stuff to people who like board games”, Facebook can identify you as such a person. Facebook does not tell the seller who you are, they just take that seller’s ad and stick it on your page. So it’s really only Facebook who knows everything about you, and hey, you knew that Facebook knew that stuff anyway, didn’t you?

So, on drfriendless.com I use this thing called Google Tag Manager, which is a place where I can configure all of the pixels I want to dump on you, i.e. all of the third parties who will find out that you visited my site. As I have no particular need to tell Facebook that you were there, there is no Facebook pixel in my GTM, so Facebook does not know that you came to DrFriendless.com. The only thing I do use is Google Analytics.

Analytics is a Google product which tells me a bit about my visitors. Some of the graphs it produces are shown below. Now you know that a person like me who creates a site like this wants those stats! So that is the evil privacy-invading tracking that I’m doing.

As for the future, I intend to be very transparent about privacy. The new General Data Protection Regulation in Europe kinda requires it, as I am operating in the European market, and I think the rules they have are sensible. So I’ll comply with them as much as I can from the ground up.

I also intend to allow users to log in to the site. This won’t be a requirement to get your stats, that will be public as always, but there are some ideas I have that require that I associate data with you, and for that I need to know your identity. And like the cookies on the old site, there will be only circumstantial evidence that links your account on drfriendless.com to your BGG account.

And as for pixels, I think it’s fair to say that I’ll tell you if I add more tracking pixels to tag manager. It might be, for example, that I add the ability to play games online to the site, and one day I need to advertise on Facebook to get more users. So then I could add the Facebook pixel to the site, and tell Facebook “I want more people who are like the people I already have”. I think that would be a reasonable use case.

Oh, and finally before I go, I should tell you that you can block this tracking stuff. I used to run Chrome extensions called Ghostery and Disconnect that stop the pixels, although I don’t know how. I know they worked though, because for days I couldn’t get work’s Google Tag Manager integration happening on my laptop. I eventually realised that if I wanted to test tracking pixels I had to stop blocking them. Hence now wherever I go all of my privacy gets invaded. On the other hand, Facebook ads do occasionally show me things I want, which is a pleasant change.


Let There Be Plays!

Sometime recently the downloading of games completed (again), and the list of which things are expansions of which other things became complete-ish. That meant that I could now download plays, and then infer plays of base games from plays of expansions. So at about 10 o’clock this morning I turned on downloading of plays.

Coincidentally at about 10 o’clock this morning I started leaking database connections. I discovered that I could change the maximum connections that the database allows, so I went from 66 to 200, which just meant that I leaked all of those as well.

So I must have put a connection leak in somewhere, but as it’s now 10:30pm, the connection leak is going to stay there until tomorrow evening when I get time to look at it. And it means that most of the site is broken because it can’t get data from the database. On the other hand, 47000 plays have downloaded successfully. I expect there will be in the order of 20 million, so if I don’t get that bug out it’ll take forever.
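
When I do get to it, the fix I’ll be looking for is the usual one: get a connection from the pool and always give it back in a finally block, even when the query throws. A sketch of the shape of it, assuming a mysql2 connection pool (which may not be what the real code uses):

```typescript
import * as mysql from 'mysql2/promise';

// A single shared pool with a hard cap on connections.
const pool = mysql.createPool({
  host: process.env.DB_HOST,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: 'extstats',          // placeholder database name
  connectionLimit: 10
});

async function recordPlays(rows: object[]): Promise<void> {
  const connection = await pool.getConnection();
  try {
    for (const row of rows) {
      await connection.query('insert into plays set ?', [row]);
    }
  } finally {
    // This is the line whose absence leaks connections.
    connection.release();
  }
}
```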

An Offering of War

The Offerings of War - statue outside the Art Gallery of New South Wales
“The Offerings of War” is a statue at the entrance to the Art Gallery of New South Wales. I’m not a really arty person, but I do like that statue and many other exhibits in that gallery. However, that is not what this blog post is about.

Whenever I can get a spare 36 hours I’ve been working on the new site. The downloader is working quite well, chugging away gathering data, and it had caught up to me, in that it had downloaded everything I had told it to and I needed to tell it to do some more stuff. In particular it needs to download plays. It knows there are 231064 (actual number) instances of a geek having played games in a month that it needs to get data on, but I hadn’t written that code yet. So I started doing that.

Getting the raw plays data is relatively easy, but Extended Stats does not run from raw data. Extended Stats tries to be smart and says “Hmm, he played Roll for the Galaxy: Ambition, so he must have played Roll for the Galaxy as well. I’ll put that in.” Which is all well and good but I hadn’t done the bit where I record which game is an expansion of which other game, so I had to do that first. So now I am reprocessing 61575 games to find out what all the expansions are. It’s working well, but there are 45036 to go.

And then I wrote the plays logic, but I won’t start running it seriously until the expansions are done, or it will look silly. And until the plays are downloaded, there will be gigantic holes in the data.
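
The inference itself is straightforward once the expansion relationships are known. Roughly like this – the names and shapes are invented for illustration, not lifted from the real plays logic:

```typescript
// Map from expansion game id to its base game id, built from the
// expansion data that's currently being reprocessed.
type ExpansionMap = Map<number, number>;

interface Play {
  geek: string;
  gameId: number;
  quantity: number;
}

// For every play of an expansion, also record an inferred play of the base
// game. (The real logic presumably has to avoid double-counting when the
// geek logged the base game as well; this sketch doesn't bother.)
function inferBaseGamePlays(plays: Play[], expansions: ExpansionMap): Play[] {
  const inferred: Play[] = [];
  for (const play of plays) {
    const baseGame = expansions.get(play.gameId);
    if (baseGame !== undefined) {
      inferred.push({ geek: play.geek, gameId: baseGame, quantity: play.quantity });
    }
  }
  return plays.concat(inferred);
}
```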

I was a bit stymied in that direction, so I turned my efforts to the user interface. From the user point of view, the web pages are all there is to Extended Stats, so I have to deliver some of them at some point. One thing I was missing was the War Table. Originally called the Pissing War Table, and renamed to suit the sensibilities of those who weren’t raised in the Australian scrub like I was, the War Table is where geeks can compare their geek cred. It also serves as a directory of everyone on the site and a way to find a link to your personal page. Here’s the new page:

http://extstats.drfriendless.com/wartable.html

The War Table boasts a number of cool features. First of all, I plan to be reusing the table component in a lot of pages, so I forked my own version of it. The plan is that I will be able to upgrade the table, which will improve all of the places which use it – for example, I would like a search box which you can use to find a particular entry in the table. However, no such upgrading was done today.

I then added tooltips to the column headers, because I know that these columns can be a bit meaningless if you don’t know what they are. And then I went the whole hog and added a Documentation button, which opens up into a few tabs of doco where I can explain exactly what’s going on.

I’m not completely pleased with the War Table. It doesn’t look exactly how I want it. The pager buttons down the bottom are the wrong colour and I can’t figure out why. There are rounded corners on things when I want square. However I’ll probably bring those pager buttons up to the top of the table, and somehow fit the Documentation button, the page flipper, the page size chooser, and the search box all into one row. However domestic duties call, and I must make an Offering of Peace to Scrabblette.

It Don’t Matter Who You Are, Just So Long As You Are There…

I’ve been working on more invisible stuff. That’s why although it looks like there’s no change, I’m still exhausted and pissed off. This afternoon’s adventure was with a user login capability. As it’s security stuff, it’s confusing and mostly seems like useless guff… but as I know so little about it, I just code how I’m told.

Now the old site doesn’t have user logins, so you might think that the new site doesn’t need them either. On the other hand the old site uses way too many cookies, and I’d like a more robust solution than that. Also I have ideas for very cool features that produce user-specific information that they would want to keep and edit later. So I have to have user login. It won’t be required to use the site, but it will be required for features that need to store information on a user’s behalf.

There are third-party packages that can handle these things for you. Auth0 is one of them, so I jammed in some Angular code to allow users to log in to the site with their Facebook or Google credentials. And that sort of worked a bit. But then Auth0 called me to tell me someone had logged in and it all went pear-shaped. You see, Auth0 calls me at a URL that I specify, and it sticks the user’s credentials on the end of the URL. However, as that URL went through an API Gateway to get to my Lambda, AWS lost the URL and I didn’t get the information I needed. I think that’s a design flaw on their part. Apparently there are ways to get around that, but as I was trying to understand them I found another option.
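
For what it’s worth, one of those workarounds seems to be API Gateway’s proxy integration, where the whole request is passed through and the Lambda can see the query string itself. A sketch of what a callback handler might then look like – and note this only helps if Auth0 puts the credentials in the query string rather than in the URL fragment, which a server never sees:

```typescript
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';

// With proxy integration, whatever Auth0 appended to the callback URL shows
// up in event.queryStringParameters instead of being thrown away.
export async function loginCallback(event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> {
  const params = event.queryStringParameters || {};
  const code = params['code'];
  if (!code) {
    return { statusCode: 400, body: 'No credentials in callback URL' };
  }
  // ...exchange the code with Auth0 for tokens here...
  return { statusCode: 302, headers: { Location: '/user.html' }, body: '' };
}
```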

AWS has a service called AWS Mobile which offers a suite of user-attached features, such as authentication, profile photos, and a bunch of other guff that I ignored. You see, my requirements are trivial beyond belief – I just want to know if this person logging in is the same person who logged in some other time. I don’t care for their name, email address, blah blah blah. I just need an opaque token that I can save in the database, and when I see that token again I get the settings out of the database and start using them for that user. Nobody seems to design for such a simple use case.

Anyway, I signed up to my own site using the AWS Mobile widget, and I appeared in the user pool on the back-end. Hooray! The widget doesn’t behave very well, so I’ll have to explore that, and I still haven’t figured out how to get the opaque token I was wanting, but the documentation seems nice. Though I do get sick of being told how to install stuff; I’d much prefer to hear what it does and how it’s used.
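
If it works the way I hope, the opaque token I’m after is the “sub” claim inside the ID token that Cognito hands back after login – a UUID that stays the same for a given user but tells me nothing else about them. Something like this (assuming the token’s signature has been verified elsewhere):

```typescript
import * as jwt from 'jsonwebtoken';

// Pull the stable, anonymous user identifier out of a Cognito ID token.
// This assumes the token's signature has already been verified elsewhere.
function userIdentity(idToken: string): string {
  const claims: any = jwt.decode(idToken);
  if (!claims || typeof claims.sub !== 'string') {
    throw new Error('Not a usable ID token');
  }
  return claims.sub;
}
```

That value is what I’d store in the database and look up on the next visit to fetch that user’s settings.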

So that was my Sunday afternoon. Stuff is progressing, slowly slowly.