Francis’s news feed

This combines some blogs which I like to read. It’s updated once a week.

March 25, 2017

BBC Research and Development: The Ethics of Data by Adrian McEwen

We were reminded the other day (when it popped up on the Adafruit blog) of a project from BBC Research and Development that Adrian took part in that we never got round to sharing on here.

The BBC R&D team have been working in and around the Internet of Things for years (not least with the Perceptive Radio that we built for them) and have a wider interest than just looking at how connected devices can work in and around broadcasting and media.

Back in 2014 they gathered together an impressive roster of experts and filmed interviews with them discussing the issues around the ethics and challenges of data, privacy and the Internet of Things.

Afterwards they distilled the interviews into a series of short films, each focused on a different theme.

All of the films are available on the Ethics of Data playlist and there’s also an introductory blog post.

The BBC plays an important non-commercial role in British life, and it’s good to see them tackling such topics.


Identity Poetics by Dominic Fox

The boy who rode on slightly before him sat a horse not only as if he’d been born to it which he was but as if were he begot by malice or mischance into some queer land where horses never were he would have found them anyway. (Cormac McCarthy, All The Pretty Horses)

I’m going to talk here about the sense of “fit” that I feel when reading something written from a subject position that seems to resemble mine, the feeling of excited recognition, and how far or in what ways it can be trusted. I had that feeling, or a feeling in that family of feelings, this morning as I read a piece of writing by Melanie Yergau, a self-identified “spectrumite” (that is, someone on the ASD spectrum). It’s a really take-no-prisoners piece of writing, superbly spikily uncollegial, and it filled me with shock and delight. Even as the essay talks about the absurdity of portioning the world of discourse up into communities bounded by sharp circumferences, I feel a proximity to it, or a desire for proximity towards it: it may not demarcate a crisp circle in a Venn diagram, but it opens up a rhetorical space that I feel comfortable in partly because it is ringed about with enough apotropaic barbed-wire to keep hostile forces at bay – and I feel myself comfortably on the right side of that barbed wire, enclosed rather than repelled by it.

At the same time I’m ambivalent about that sense of comfort, which seems simultaneously tempting and presumptuous. On the one hand, it’s tempting to say that it’s no accident that I feel this sense of familiarity and identification: something in me responds to something about that written voice, the experience and perspective it articulates, and this must be because I have something in common with the speaker; perhaps it is that I am also a “spectrumite” (as I’ve believed for a while, in fact). On the other hand, it is unavoidably true that it is also somewhat of an accident. This discourse, this manner of speaking, might never have existed, or I might never have come across it. There is nothing inevitable or necessary about the articulation that has formed between my way of feeling and this form of expression. My identification with it, my use of it as a mirror in which I think I see something about myself reflected, is opportunistic, and might even be seen as voyeuristic or exploitative. One of the concerns of Yergau’s piece is the desire of the “typical autistic essay” to maintain a sharp distinction between those who may legitimately picture themselves in that mirror, and those who are copping an attitude (so to speak). I’m not sure I can reliably distinguish in myself between sincere, well-founded, defensible identification, and trying a position on for size. I’m not not copping an attitude.

I would like, it is true, a bona fide, argument-ending retort to the suggestion that I am simply impossibly stubborn, or was inadequately socialised – that is, whipped into shape – as a child. I know that there is something odd about me that is as incorrigibly resistant to force or persuasion as some people find their sexuality to be. You can squelch it or deform it, but you really can’t make it go away. The world of my childhood was quite sharply divided into people who recognised that this was the case, and people who didn’t; and the latter were and remain an enemy between whom and myself there can never be enough barbed wire. My oddness doesn’t excuse me from making adult decisions about how to function in the world, but it does impose upon me the necessity of weighing up questions such as “will putting myself in this situation cause me to have a frightening and embarrassing meltdown” to which the answer is sometimes “yes” irrespective of all the very good reasons why I ought to just knuckle down and get on with it. The task of cultivating “coping strategies” is, let’s say, ongoing.

It’s not just a matter of mapping internal states, but of giving names to experiences. The experience of being aspified or autismatised, if you like. A while back I was standing with a group of friends, all computer-y people, talking about the sorts of things computer-y people like to talk about. We were relaxed, companionable, fluidly interacting. Then some non-computer-y people came over and started talking to us, and we were instantly “the nerds”. It was efficiently, if subliminally, communicated to us that our confidence in our own social presentation was ill-founded, that in fact we should be ashamed of ourselves for existing. After we prickled at them a bit, the aggravating people (who said nothing outwardly aggressive or harassing) went away and we were able to recover ourselves somewhat. But I was strongly reminded of how much “nerdiness” is a social relationship, a hierarchical relationship, rather than an essence one carries around inside oneself. This wasn’t, in my reading, just ingroup-meets-outgroup (which will always shift the tone somewhat). It was very distinctly an experience of being put in one’s place.

There is always something a bit vicarious, a bit unstable, about identification. Which doesn’t mean that we’re all mistaken in our identities – it’s more that there’s a constitutive instability to the way in which society at large constructs and projects identity categories. We live through others’ ambivalence towards us, and are sometimes forced to bear the consequences of their attempts to resolve that ambivalence to their own satisfaction. That motive seems to me to be at the root of most of the psychic violence that has ever been directed towards me, in any case (and is undoubtedly also at the root of much of the psychic violence that I’ve doled out in turn). It’s a relief to escape from that ambivalence into a positive assertion of identity, but I am always haunted by the feeling that this is a sort of mis-step. The prerogative of deciding belongs to others: when you’re acting peculiarly, you’re being a bit autistic (which is a problem you need to work on); when you’re affirming neurodiversity, you’re not nearly autistic enough to own that position. This is perhaps a better problem to have than the problem of being positioned as irremediably low-functioning, but it’s generated by the same value system. I’ve learned from my wife’s research that there’s a similar catch-22 around class identity, where those with the cultural clout to challenge classist stigma are invariably defined out of the stigmatised group (“not really” working class, or working class but not really impoverished, etc), and can be variously accused of copping an attitude, maintaining a chip on the shoulder that really ought to have been brushed off by now, or illicitly appropriating others’ genuine misery.

It was impossible not to wince when I was introduced to a friend-of-a-friend who had done therapeutic work with autistic children with “oh yes, Dom’s a bit on the spectrum himself” – as I sat in a noisy pub, drinking beer, not noticeably twitching (the beer helps), and even remembering to make eye-contact from time to time. Her diplomatic “well, we’re all somewhere on the spectrum” was I suppose the nicest possible way of telling me to pull the other one. It’s true and it’s not true, because neurotypicality is a thing and not everybody is that thing, and because the prevailing assumption is that neurotypicality is healthy, functioning and socially viable in ways that non-neurotypicality is not. Where some of us are on the spectrum – even if it’s a long way from being constantly excruciatingly over-stimulated by everyday situations – is a problem. Identification with the problematised position comes with a shock of delighted recognition when you see that it is after all possible to speak from within its contradictions.


Temperature catches sunburn by Goatchurch

We move on to the temperature sensor work, and the controversial concept that the temperature of the rising air in a thermal is hotter than the non-upwardly-mobile surrounding environmental atmosphere.

I say it’s controversial because the meaning of “the temperature” in relation to a mobile and turbulent airmass whose structure spans hundreds of metres in the vertical dimension and thousands of Pascals in relative pressure is undefined. Remember that the adiabatic temperature differential is about 0.7 degrees per 100m change in altitude.
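To see why altitude matters so much here, readings taken at different heights can be referred to a common altitude before they are compared. This is a minimal sketch, assuming the 0.7 degrees per 100m figure quoted above and a simple linear correction; the function name and the sample readings are made up for illustration and are not part of the flight-logger code.

```python
LAPSE_DEG_PER_100M = 0.7  # figure quoted in the post; assumed constant with height


def temp_at_reference(temp_c, alt_m, ref_alt_m=0.0, lapse=LAPSE_DEG_PER_100M):
    """Refer a temperature reading to a common altitude using a linear lapse rate.

    Air moved from alt_m down to ref_alt_m is assumed to warm by
    `lapse` degrees for every 100 m of height lost.
    """
    return temp_c + lapse * (alt_m - ref_alt_m) / 100.0


# Two made-up readings taken 300 m apart differ by 2.1 degrees in raw form,
# but agree once both are referred to 500 m.
low = temp_at_reference(10.0, 500.0, ref_alt_m=500.0)
high = temp_at_reference(7.9, 800.0, ref_alt_m=500.0)
```

Any difference left after a correction like this is the part that might say something about the thermal itself, subject to all the caveats about position and sensor lag.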

We do, however, have a single point temperature sensor on a mobile glider which is also progressing up and down the air column (depending on the wind currents and skill of the pilot). The location of the glider with the single temperature sensor is imperfectly known (due to bad GPS (see previous post) and inexplicable barometric behavior (see next post)), and the sensor itself has a thermal mass which means its readings have a delay half-life of about 9 seconds in flowing air (see this other post).
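The effect of that 9 second half-life can be illustrated with a first-order lag model. This is a sketch under the assumption that the sensor closes half of the remaining gap to the true air temperature every half-life; the function name and the test signal are hypothetical.

```python
def lagged_response(true_temps, dt, half_life=9.0):
    """Simulate a sensor that closes half of the gap to the air temperature
    every `half_life` seconds (a first-order lag; an assumed model)."""
    keep = 0.5 ** (dt / half_life)   # fraction of the gap remaining after dt
    reading = true_temps[0]
    out = [reading]
    for temp in true_temps[1:]:
        reading = temp + keep * (reading - temp)
        out.append(reading)
    return out


# A sudden 1-degree step in air temperature, sampled once a second.
step = [0.0] + [1.0] * 30
readings = lagged_response(step, dt=1.0)
```

In this model the reading has covered only half of the step after 9 seconds and three quarters after 18 seconds, which is a long time for a glider moving through a thermal.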

I have taken the precaution of jettisoning my slow, accurate and low resolution dallas temperature sensor for two humidity/temperature sensor combos and an infrared thermometer that has quite a good ambient temperature sensor within its metal can.

These sensors tracked one another pretty well, except for one small problem when I spotted a series of spikes in one and then the other humidity/temperature sensor.

temperaturespikes

What is going on?

I identified these spikes at times 15:42:49, 15:43:07, 15:43:46, 15:43:57, 15:44:07, 15:44:17 and 15:44:27 and plotted them on the more-consistent-with-the-evidence-at-this-moment 6030 GPS (seen on the right), but had to include a 2.5 second offset to my flight logger’s GPS times to make the timing positions consistent with those running in my flight-logger. (I still can’t see how it’s underplaying the arrival times, seeing as the timestamps should be coming from the GPS satellites, and competition pilots would prefer their computers to shift them forward in space if they were choosing any direction.)

spikepositionsgps

The blue dot is at the start of the time window, and the 10Hz sample GPS on the left gets the first two left hand loops badly wrong, but then looks okay for the five right hand loops. (The right hand vario GPS makes a reading only every 2 seconds.)

The two green diamonds coincide with the spikes in the si7021 humidity meter, while the five red diamonds are the spikes on the SHT31 humidity meter. The coloured line segments are the axis of the vertical king-post, so they are perpendicular to the plane of the wing.

When turning, the glider banks inwards, but there is also an angle of attack, so the perpendicular line to the wing is going to rake backwards from the heading.

You can see why I was getting nowhere for the past year with traces like on the left where I didn’t know that the GPS was implying a velocity that was 90 degrees out from the truth.

Now the temperature sensors were recessed on the bottom of the device, out of sight of the sunlight, so I thought, but looking at the video you can see the sun is quite low to the horizon and could have snuck in there.

According to suncalc.org, on this day at this time, the sun was 19 degrees above the horizontal in a direction of 233 degrees (West of Southwest).

sunpos
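The sun position can be turned into a direction vector, which makes it easy to compare against any axis of the glider. This is a minimal sketch, assuming azimuth is measured clockwise from north and an east/north/up axis convention; the helper names are hypothetical.

```python
import math


def sun_vector(azimuth_deg, elevation_deg):
    """Unit vector pointing at the sun as (east, north, up).

    Azimuth is measured clockwise from north, elevation up from the horizon.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.sin(az) * math.cos(el),
            math.cos(az) * math.cos(el),
            math.sin(el))


def angle_between_deg(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


sun = sun_vector(233.0, 19.0)      # values quoted from suncalc.org above
straight_up = (0.0, 0.0, 1.0)      # kingpost axis in level flight
tilt = angle_between_deg(sun, straight_up)   # 90 minus the elevation
```

Replacing `straight_up` with the kingpost axis reported by the orientation sensor would give the angle between the sun and the wing at each moment, provided the sensor axes are expressed in the same east/north/up convention.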

The sensors are mounted spreadwise to expose them to the airstream, with the Si7021 on the right and the SHT31 on the left, and potentially exposing the former to direct sunlight on a left bank (during a left turn) and the latter to sunlight on the right bank (during a right turn). This is indeed consistent with the 2 spikes then 5 spikes pattern in the record above.

Not every tight turn scores a hit from the sun. Why?

I can filter out the low frequencies in the temperature signal to just indicate the spikes (they get doubled up unfortunately, but there’s only one spike there), and plot them on the line at the same time as the roll/bank of the glider in degrees, like so:

g = fd.pG.tG[t0:t1]
fg = FiltFiltButter(g, 0.05)
plt.plot((g - fg)**2*1000, label="si7021")
plt.plot(np.degrees(np.arcsin(SinRoll(-fd.pZ[t0:t1]))), label="rollangle")

spikesonroll

This shows a wobbly bank left followed by a wobbly bank right, where the wing fluctuates between 20 and 50 degrees (I wonder if this is inefficient flying or not?) The spikes/sun-strikes occur in the middle of the higher bank angles.
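The implementation of `FiltFiltButter` is not shown; its name suggests a Butterworth low-pass applied forwards and backwards. The sketch below swaps in a much simpler single-pole smoother run in both directions, which is enough to show the idea of subtracting a zero-phase low-pass from the signal and squaring the residue. The function names are hypothetical and `alpha` is a smoothing factor, not equivalent to the 0.05 passed to `FiltFiltButter`.

```python
def smooth_both_ways(values, alpha):
    """Zero-phase low-pass: a single-pole smoother run forwards, then backwards."""
    def one_pass(seq):
        out = [seq[0]]
        for v in seq[1:]:
            out.append(out[-1] + alpha * (v - out[-1]))
        return out
    forward = one_pass(values)
    backward = one_pass(forward[::-1])
    return backward[::-1]


def spike_signal(values, alpha=0.05):
    """Squared high-frequency residue, in the spirit of (g - fg)**2 above."""
    low = smooth_both_ways(values, alpha)
    return [(v - l) ** 2 for v, l in zip(values, low)]


# Made-up flat 12-degree trace with a single half-degree blip at sample 50.
trace = [12.0] * 100
trace[50] += 0.5
residue = spike_signal(trace)
peak_index = residue.index(max(residue))
```

Running the smoother in both directions cancels the delay a one-way filter would introduce, so the residue peaks at the same sample as the blip.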

Here is a slightly busier plot of the same thing, with the orientation heading plotted on it.

plt.plot((NorthOrient(fd.pZ[t0:t1]))*0.1-36, label="heading-deg/10")

spikesonrollheading

The heading of the glider is in degrees between 0 (north) and 360 (north again). The plot has discontinuities as it passes the northerly direction, first two turns counter-clockwise to the left and then multiple turns clockwise to the right. I’ve offset the plot down by 360 and divided it by 10 so it plots on the same graph.
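The discontinuities at north can be removed by accumulating the shortest signed turn between consecutive samples. This is a minimal sketch with a hypothetical function name, assuming the heading never changes by more than 180 degrees between samples.

```python
def unwrap_degrees(headings):
    """Remove the 360-degree jumps from a sequence of compass headings."""
    out = [headings[0]]
    for h in headings[1:]:
        step = (h - out[-1] + 180.0) % 360.0 - 180.0   # shortest signed turn
        out.append(out[-1] + step)
    return out


# A right turn through north, then a left turn through north.
right_turn = unwrap_degrees([300.0, 350.0, 40.0, 90.0])
left_turn = unwrap_degrees([40.0, 350.0, 300.0])
```

On an unwrapped trace a clockwise turn rises steadily and a counter-clockwise turn falls steadily, and the difference between the last and first values divided by 360 gives the number of full circles flown.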

You can see that I only get a spike when the glider is heading north at the same moment as there is a banking angle of over 40 degrees. Otherwise I presume a shadow continues to be cast over the sensor.

Well, we can sort that out:

tempcover

Now, hopefully this doesn’t block off the flow of air past the temperature sensors and ruin my carefully calibrated response curves.

Also, you’re asking, what’s with the infra-red temperature sensor exposed in the funnel looking at the ground?

Well, that’s easy. All I need to do is calculate its field of view of ground area from the landscape height field and the solar heating from the sun direction on the contours of the ground, and then it’ll tell me if the ground under that one pixel of view is too hot or too cold, like it’ll mean anything.

This is not quite such a stupid idea as I thought.

Here’s the graph of the infra-red sensor pointing at the ground temperature vs the altitude.

irtemp

So, the IR temperature is reading about the same throughout. The oscillations when the glider is climbing are due to the circling to stay in the thermal with a bank angle and sweeping different parts of the landscape.

But like all good ideas in sensor land, there’s no quick win. Here’s plotting the times when the glider is flying level with red dots when the ground view temperature is hotter than 10.3 degrees, and blue otherwise:

t = 10.3
q = fd.pQ[t0:t1].copy()   # GPS positions
q["m22"] = utils.InterpT(q, fd.pZ.m22)  # Z axis of kingpost
q["tI"] = utils.InterpT(q, fd.pI.tI)    # IR temperature
fq = q[(q.m22<-0.95) & (q.tI < t)].sample(200)
plt.scatter(fq.x, fq.y, color='b')
fq = q[(q.m22<-0.95) & (q.tI > t)].sample(200)
plt.scatter(fq.x, fq.y, color='r')

irtempground

The higher ground is to the Southeast in this picture where the IR temperatures are higher, which is wrong! Maybe I really do need to factor in the angle of attack and the height of the ground. This is not inconceivable.

So although not one thing worked properly in this study to date, I’m getting much more confident of these data tools and that they can get me there — if there is a there there!


GPS is a jerk by Goatchurch

Last week I finally had my first flight of the year with my newly built flight data logger. I can’t believe the number of issues it’s already thrown up.

At least I may be making quick enough progress to get past the issues (rather than being swamped by them) using this exceptionally efficient Jupyter/Pandas technology.

For example, my code for parsing and loading the IGC file is 15 lines long.

The code for loading in my flight logger data into a timeseries is just as brief, if you consider each data type individually (there are more than 13 of them from humidity sensors to an orientation meter).

The GPS time series from my flight logger (at 10Hz) can be crudely converted to XYs in metres, like so:

# pQ is the GPS position pandas.DataFrame
earthrad = 6378137
lng0, lat0 = pQ.iloc[0].lng, pQ.iloc[0].lat
nyfac = 2*math.pi*earthrad/360
exfac = nyfac*math.cos(math.radians(lat0))
pQ["x"] = (pQ.lng - lng0)*exfac
pQ["y"] = (pQ.lat - lat0)*nyfac
plt.plot(pQ.x, pQ.y)

gpstrack1
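The same conversion can be written as a standalone function for a single point. This is a sketch of the flat-earth approximation used above, with a hypothetical function name and made-up coordinates; it is only reasonable over the few kilometres covered by a flight.

```python
import math

EARTH_RADIUS_M = 6378137.0   # same value as `earthrad` above


def to_local_xy(lat, lng, lat0, lng0):
    """Convert latitude/longitude in degrees to metres east (x) and north (y)
    of the reference point (lat0, lng0), using a flat-earth approximation."""
    metres_per_deg_lat = 2 * math.pi * EARTH_RADIUS_M / 360
    metres_per_deg_lng = metres_per_deg_lat * math.cos(math.radians(lat0))
    return ((lng - lng0) * metres_per_deg_lng,
            (lat - lat0) * metres_per_deg_lat)


# One hundredth of a degree north and east of an arbitrary reference point.
x, y = to_local_xy(54.01, -2.99, 54.0, -3.0)
```

The east-west scale shrinks with the cosine of the latitude, which is why `exfac` is smaller than `nyfac`.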

Note the suspicious sharp turn near (-1000, -400). Here’s another sharp turn somewhere else in the sequence covering a 1 minute 5 second period using time slicing technology:

t0, t1 = Timestamp("2017-03-09 15:42:55"), Timestamp("2017-03-09 15:44:00")
q = fd.pQ[t0:t1]
plt.plot(q.x, q.y)

gpstrack2

The dot is at the start point, time=t0.

It’s taken me days to conclude that this is complete bollocks.

The first piece of evidence was the summation of the velocities made by the GPVTG records of velocity and heading (degrees), which we can sum cumulatively like this to recreate the path:

v = fd.pV[t0:t1]
vsx = (v.vel*numpy.sin(numpy.radians(v.deg))).cumsum()
vsy = (v.vel*numpy.cos(numpy.radians(v.deg))).cumsum()
plt.plot(vsx*0.1, vsy*0.1)  # *0.1 because the readings are 10Hz

gpstrack3

No impossibly sharp U-turns there, but it does claim I did an S-turn, first to the left to head south before turning sharp right to turn north. At no time does this show me heading east.
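The cumulative sum is a form of dead reckoning: each velocity sample is split into east and north components and added on. This is a minimal sketch without pandas, with a hypothetical function name and a made-up two-leg flight as input.

```python
import math


def integrate_track(speeds, headings_deg, dt):
    """Rebuild a path by summing velocity samples.

    speeds are in m/s, headings in degrees clockwise from north,
    dt is the sample interval in seconds. Returns lists of x (east)
    and y (north) positions in metres, relative to the start.
    """
    xs, ys = [], []
    x = y = 0.0
    for speed, deg in zip(speeds, headings_deg):
        x += speed * math.sin(math.radians(deg)) * dt
        y += speed * math.cos(math.radians(deg)) * dt
        xs.append(x)
        ys.append(y)
    return xs, ys


# One second due east at 10 m/s, then one second due north, sampled at 10 Hz.
xs, ys = integrate_track([10.0] * 20, [90.0] * 10 + [0.0] * 10, dt=0.1)
```

Because errors accumulate, a path rebuilt this way drifts in absolute position, but its shape is still a useful check on whether the turns in the position record are believable.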

This was not consistent with the BNO055 orientation sensor, which reads at 100Hz and which can be summed in the same way as the GPS velocity on the assumption that the glider heads at 10m/s in the direction it is pointed. (Don’t worry about the wind. It was very light that day.)

dorient = utils.NorthOrient(fd.pZ)[t0:t1]
vosx = numpy.sin(numpy.radians(dorient)).cumsum()
vosy = numpy.cos(numpy.radians(dorient)).cumsum()
plt.plot(vosx*0.1, vosy*0.1)

gpstrack4

This is a lot more realistic, as well as exhibiting a 270 degree left hand turn not shown by the GPS.

It is, however, consistent with the video footage from my badly misaligned keel camera.

The GPS does not always produce bollocks. How do we find out when it is good?

We can calculate the heading in degrees from consecutive locations and compare it to the velocity heading to observe that sometimes they do agree and sometimes they are very far out, like so:

q = fd.pQ[t0:t1]
vq = q.shift(5) - q.shift(-5)  # 1-second range at 10Hz
qdeg = (numpy.degrees(numpy.arctan2(vq.x, vq.y))+180)
qdeg.plot()
fd.pV[t0:t1].deg.plot()

gpsvelcompare1
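For a single displacement, the course calculation and the comparison between two headings can be sketched as below. The helper names are hypothetical. Taking the result modulo 360 is an alternative to differencing the positions in reverse and adding 180, and the second helper avoids false disagreements when two headings sit either side of north.

```python
import math


def course_deg(dx, dy):
    """Compass course in degrees (0 = north, 90 = east) for a displacement
    of dx metres east and dy metres north."""
    return math.degrees(math.atan2(dx, dy)) % 360.0


def heading_difference(a_deg, b_deg):
    """Smallest signed difference a - b between two headings, in -180..180."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0
```

For example, headings of 350 and 10 degrees are 20 degrees apart, not 340.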

The same can be done in the velocity domain:

numpy.sqrt(vq.x**2 + vq.y**2).plot()
v.vel.plot()

gpsvelcompare2

And here’s how it resolves itself in the vy domain, which is going to be easier to compare:

vx = v.vel*numpy.sin(numpy.radians(v.deg))
vy = v.vel*numpy.cos(numpy.radians(v.deg))
vq = q.shift(-5) - q.shift(5)
vy.plot()
vq.y.plot()

gpsvelcompare3

The readings and derived readings are a bit hard to compare like this, so I’ve interpolated the velocities to the gps position time series (because they have slightly different time-stamps) and low-pass filtered them to get rid of some of the noise and width alignment that would interfere with the comparison.

vyi = utils.InterpT(vq.y, vy)
plt.plot(utils.FiltFiltButter(vyi, 0.015), label="posvel")
plt.plot(utils.FiltFiltButter(vq.y, 0.015), label="realvel")

gpsvelcompare4

What we would like to work up to is a method of selecting out the sets of points in the gps sequence that are bad so we don’t waste time attempting to process bad data.

Here’s how we plot the sequences of points (in black) where the gps position and velocity disagree by more than 4m/s in either dimension.

# from the position, get the velocity in x and y
q = fd.pQ[tf0:tf1]  
vq = (q.shift(-5) - q.shift(5)).dropna()
    # drop the 5 end-values as they interfere with the filter

# from the velocity calculate the x and y
v = fd.pV[tf0:tf1]
vx = v.vel*numpy.sin(numpy.radians(v.deg))
vy = v.vel*numpy.cos(numpy.radians(v.deg))

# interpolate to position series and filter the difference
vxi = utils.InterpT(vq.x, vx)
vxd = numpy.abs(utils.FiltFiltButter(vxi - vq.x, 0.015))
vyi = utils.InterpT(vq.y, vy)
vyd = numpy.abs(utils.FiltFiltButter(vyi - vq.y, 0.015))

# *advanced pandas code here*
# drop the 5 end-values to allow comparison
# and subselect against boolean timeseries specified by when the 
# difference in smoothed velocities exceeds 4 in x or y axis
badq = q.iloc[5:-5][(vxd>4) | (vyd>4)]
plt.plot(q.x, q.y)
plt.scatter(badq.x, badq.y)

gpsbadvel1
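Stripped of the interpolation and smoothing, the selection step amounts to a threshold on the per-axis disagreement, after which the flagged samples can be grouped into contiguous stretches. This is a sketch with hypothetical function names and made-up velocities, not the pandas code above.

```python
def flag_bad_points(pos_vx, pos_vy, gps_vx, gps_vy, limit=4.0):
    """Return the indices where position-derived and reported velocity
    differ by more than `limit` m/s on either axis."""
    bad = []
    for i, (px, py, gx, gy) in enumerate(zip(pos_vx, pos_vy, gps_vx, gps_vy)):
        if abs(px - gx) > limit or abs(py - gy) > limit:
            bad.append(i)
    return bad


def runs(indices):
    """Group sorted indices into (first, last) runs of consecutive values."""
    out = []
    for i in indices:
        if out and i == out[-1][1] + 1:
            out[-1] = (out[-1][0], i)
        else:
            out.append((i, i))
    return out


# Made-up velocities: the two sources part company for samples 3 to 5.
pos_vx = [10, 10, 10, 2, 1, 3, 10, 10]
gps_vx = [10, 10, 10, 10, 10, 10, 10, 10]
zeros = [0] * 8
bad = flag_bad_points(pos_vx, zeros, gps_vx, zeros)
bad_runs = runs(bad)
```

Working with runs rather than individual samples makes it easier to discard a whole stretch of flight, plus a margin either side, when the GPS has lost the plot.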

And this was when I remembered that I was carrying a second GPS in my 6030 vario device and had to quickly write a parser for its IGC file, to plot it as an overlay like so:

gpsbadvel2

So it’s plausible that by comparing the GPS position differences to the GPS velocity we’re able to spot the bits where the GPS readings are pants so as not to waste any effort trying to make sense of it, because there is no sense.

For a closer look, here’s how they line up in that 1 minute period shown above:

gpstrack5

How about comparing these two GPS readings to each other?

The vario GPS reads once every 2 seconds, while my flight logger GPS reads 10x a second, but with a bit of pandas code, we can align them by the GPS timestamps u (rather than by the microcontroller’s millisecond timestamp) and subselect accordingly:

useries = g.index.to_series().apply(lambda X: X.hour*3600+X.minute*60+X.second+0.0)
dfg = pandas.DataFrame(data={"lat":g.lat, "lng":g.lng, "u":useries})
dfg = dfg.set_index("u")

dfg["qlat"] = pandas.Series(list(q.lat), q.u/1000)
dfg["qlng"] = pandas.Series(list(q.lng), q.u/1000)

plt.scatter((dfg.lat - dfg.qlat)*50, (dfg.lng - dfg.qlng)*50)

gpsbadvel3

It’s a donut!

There’s something suspicious about donut errors because it looks like something is chasing its tail.

Let’s try adding on 2 seconds to the datalogger data to see how it compares:

qu = q.u/1000+2.0
dfg["qx"] = pandas.Series(list(q.x), qu)
dfg["qy"] = pandas.Series(list(q.y), qu)
plt.scatter((dfg.x - dfg.qx), (dfg.y - dfg.qy))

gpsbadvel4

And if we add on 4.0 seconds we get the donut back:

gpsbadvel5

We can loop through to find the offset which has the minimal error, like so:

varss = [ ]
for i in range(40):
    qu = q.u/1000+i/10
    dfg["qx"] = pandas.Series(list(q.x), qu)
    dfg["qy"] = pandas.Series(list(q.y), qu)
    varss.append((dfg.x - dfg.qx).var() + (dfg.y - dfg.qy).var())
plt.plot(varss)

gpsbadvel6

Turns out this is at 2.1 seconds.

What basically happens is that the timestamps in the vario match the position given in the timestamps by the flight logger two seconds earlier.
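A constant time lag on a circling glider gives position differences of roughly constant size whose direction rotates with the turn, which is what draws the donut. The offset search can be sketched on synthetic data as below. The function names and the test track are made up, and the score is a plain population variance rather than the pandas `.var()` used above, which does not change where the minimum falls.

```python
import math


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_lag(reference, delayed, max_lag):
    """Find the shift (in samples) of `delayed` that best matches `reference`,
    scoring each shift by the variance of the differences."""
    scores = []
    for lag in range(max_lag + 1):
        n = min(len(reference), len(delayed) - lag)
        diffs = [reference[i] - delayed[i + lag] for i in range(n)]
        scores.append(variance(diffs))
    return scores.index(min(scores)), scores


# Synthetic circling track: `delayed` reports each x position 21 samples late,
# which would be 2.1 seconds at 10 Hz.
xs = [100.0 * math.sin(0.02 * i) for i in range(600)]
delayed = [0.0] * 21 + xs
lag, scores = best_lag(xs, delayed, max_lag=40)
```

A clean minimum in `scores` suggests a pure timing offset; a shallow or ragged one would point to errors that a time shift alone cannot explain.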

I don’t know if this (or the horrendous directional errors) is caused by setting the GPS to sample at such a high rate (10 times a second), or it being a comparatively bad device, or what.

It appears I can’t trust the GPS readings to even log my direction to within 90 degrees accuracy, so there’s no point in using it in relation to the glider kinematics (turning circles etc). This explains why all of my orientation kinematics attempts have fallen flat.

The GPS velocity method for sifting out bollocks sets of readings is plausible, but probably too noisy and messy.

Given that different GPS devices can at times disagree, it looks like I could bolt on one or two extra GPS chips into my unit and maybe run them at different update frequencies. And then only run kinematics models during periods of time when they all agree.

To recap, the challenge is to discover a device that can inform you, the pilot, of three things in real-time: (1) the glider’s performance, (2) your own performance, and (3) important facts about the airmass. This should make a difference for people like me who don’t have the talent and good fortune to be able to practice flying for hundreds of hours per year.

Next up, air temperature, sunlight and very shoddy barometer readings.

Update

A long afternoon of bodging code, resoldering wires to access serial pins and a whole stick of hot glue and I now have a second GPS stuck on to the top of the system.

secondgps

Nobody has ever accused me of wasting too much time on design, but you’ve got to keep focused on the problem.

There is still scope to add on a third GPS chip should it be required, but this will hopefully not be necessary. The point is not to have an accurate position all the time, it’s to know when the position is not at all accurate so I don’t attempt to base any calculations on it when it is bad.


Government Just Gave Your ISP Even More Power: You Can Take it Back! by Albert Wenger

Yesterday the Republican-controlled Senate voted to allow ISPs to sell customer data including browsing history without prior customer consent. I tweeted that in response it is “Time to tunnel all home traffic through a proxy.” First let me explain for a less technical reader what that means. At the moment your ISP can see every web request you make. They can’t see inside encrypted requests and much of the web is encrypted these days (https versus http). But they can see whether you are going to Netflix or to Hulu. So being able to sell this information is valuable. Hulu might want to target their advertising at households that don’t already use Hulu. If you establish an encrypted tunnel to a proxy first, then all your connections go through that proxy and your ISP no longer knows anything about which sites or services you are accessing.

Many people have objected to ISPs being able to do this with a privacy argument. As readers of this blog or my book “World After Capital” know, I think privacy is a red herring that is leading us down a path towards controlled computation. So on what grounds then am I objecting to this? Quite simple: this is a further abuse of monopoly power by broadband ISPs. I live on 22nd Street in the Chelsea section of Manhattan and I have zero choice in broadband providers. My only available option is Spectrum, formerly known as Time Warner Cable. That’s it. Verizon Fios is not available on my block. And even if it were, the two have stopped competing meaningfully with each other.

I already pay a lot of money for relatively poor bandwidth. As someone from Chattanooga pointed out they have a homegrown municipal ISP that provides them with 10 Gbps. I just ran Speedtest and got a measly 6.6 MBps downstream (and before you comment, yes, I have a fast router and switch setup and this is far below the speeds my wifi supports). So instead of allowing my monopoly provider to make even more money, I am looking for the opposite: better service at a lower price.

There are two routes toward that end goal. In the long run we can and will have more competition in the local access market. This will happen as wireless technology improves and as some competitive offerings enter the market. At USV we have funded Pilot Fiber and bought a stake in Tucows (both deals led by my partner Brad). This will likely take decades though to play itself out. In the meantime, we need regulation that limits how my local monopoly (or maybe duopoly) can use its market power to extract economics from customers and distort access to the internet. This is really the same reason I have been a longstanding proponent of net neutrality.

So coming back to the idea of tunneling all traffic through a proxy. If you choose a free proxy, please realize that it is free in all likelihood because they are selling your data! After all operating a proxy costs them money. If you set up your own proxy or use a paid service you will now have cost on top of your ISP bill. So why do I still recommend doing this? I see it as an act of protest against undue market power. It is a way of individually reclaiming some of the power that the government has just stripped from us and transferred to our monopoly providers. A tunnel removes the ISP’s ability to make more money on your data and it reestablishes net neutrality for you (as the ISP can no longer treat traffic from different sources with different priorities).


Uncertainty Wednesday: PSA Test Example (Part 4) by Albert Wenger

Last time in Uncertainty Wednesday, I announced that we would look at the PSA Test Example using absolute numbers instead of probabilities. Now you may recall that we defined probability as how likely something is to happen. And in introducing the PSA Test Example, I provided the probabilities for the elementary events for a 50-year old male as follows:

P({AL}) = P(healthy *and* low PSA) = 0.907179
P({AH}) = P(healthy *and* high PSA) = 0.089721
P({BL}) = P(cancer *and* low PSA) = 0.001519
P({BH}) = P(cancer *and* high PSA) = 0.001581

How do we go from these to absolute numbers? What we want to know is what happens if we look at a group of 10,000 50-year old males assuming that these probabilities apply equally to every male in the group. Put differently we are assuming that one male in the group having cancer does not make it more or less likely that another does and one receiving a high signal does not make it more or less likely that another does. It is important to recognize that this embeds many assumptions, such that this type of cancer is not contagious and that the test equipment in use is working properly each time!

With this assumption, we can simply apply the probabilities as fractions to the total population. And with a bit of rounding we get the following counts which I will denote using N as in “number of”:

N({AL}) = N(healthy *and* low PSA) = 9072
N({AH}) = N(healthy *and* high PSA) = 897
N({BL}) = N(cancer *and* low PSA) = 15
N({BH}) = N(cancer *and* high PSA) = 16

You can easily verify that this adds up to 10,000.

Now let’s revisit the question about the conditional probability that you have cancer (B) after receiving a high PSA level signal (H). We can see that in total there are N(H) = 897 + 16 = 913 people who receive an H signal. Of these only N({BH}) = 16 have cancer. So the likelihood of having cancer conditional on a high PSA level is

P(B | H) = N({BH}) / N(H) = 16 / 913 =  0.0175

or 1.75% which is pretty much the same as the 1.73% we found previously (the difference is the result of rounding when we went to absolute numbers).

Many people find it much more intuitive to think about absolute numbers. You should try out using the absolute numbers to answer the other question which was “How likely is it that you do have cancer (B) even though your PSA level was low (L)?” (left as an exercise for the reader – I love saying that!).

Keep in mind that given the assumption of the probabilities applying equally to each member of the group, using absolute numbers is really just a scaling of the probabilities (in our case by a factor of 10,000). To answer the conditional questions we wind up forming fractions in which both the numerator and the denominator were scaled by the same factor and hence the factor immediately cancels back out. You should convince yourself of this by using 100,000 or 1,000,000 as the size of the group.
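The scaling argument can be checked with a few lines of code. This is a minimal sketch using the probabilities given above; the function names are hypothetical and the counts are rounded to whole people in the same way as in the text.

```python
probabilities = {
    "AL": 0.907179,   # healthy and low PSA
    "AH": 0.089721,   # healthy and high PSA
    "BL": 0.001519,   # cancer and low PSA
    "BH": 0.001581,   # cancer and high PSA
}


def counts(group_size):
    """Expected number of people in each elementary event, rounded."""
    return {event: round(p * group_size) for event, p in probabilities.items()}


def p_cancer_given_high(n):
    """P(B | H) computed from a table of counts (or of probabilities)."""
    return n["BH"] / (n["AH"] + n["BH"])


for size in (10000, 100000, 1000000):
    print(size, counts(size), round(p_cancer_given_high(counts(size)), 5))

exact = p_cancer_given_high(probabilities)
```

As the group gets larger the rounding matters less, and the value computed from counts converges on the one computed directly from the probabilities.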

Next week we will wrap up this example by looking at two measures, called sensitivity and specificity, that are widely used to assess the quality of medical tests (meaning how strong a signal is the test producing).


When Will Blockchain-based Decentralization Matter? by Albert Wenger

I became interested in Bitcoin reasonably early and was fortunate to invest in some early mining (although as with all good investments in retrospect not nearly enough). There is currently a fight brewing in Bitcoin world with the possibility of a hardfork a la Ethereum into two different currencies. Much like the Ethereum fight and resulting hardfork, emotions among the people close to it are running high. If you are inside this community, the stakes seem incredibly high. But it is moments like that when it is good to take a step back and assess where we are.

The marketcap of Bitcoin and Ethereum combined at the moment that I am writing this is $21 Billion and change. If you add in the 98 biggest currencies you get to $24 Billion. For comparison, the marketcap of Google alone is nearly $600 Billion. So a single centralized player today is 20x the bulk of the value of all decentralized currencies.

You could choose a different statistic and ask how many people use systems powered today by a decentralized foundation. Unfortunately it is harder to come by statistics here, but towards the end of last year there were fewer than 2 million bitcoin addresses with more than 0.1BTC  (about $100) in them. Presumably quite a few of these belong to the same people. If you go down to 0.01BTC ($10) you go up to only 5 million addresses and even at 0.001BTC ($1) you are at about 10 million addresses. Now it is likely far fewer people because many people have multiple accounts with a few BTC in them. For comparison, WeChat alone, which is largely a China phenomenon, has 700 million active users. So again there is more than an order of magnitude difference.
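To put rough numbers on both comparisons, here is a quick back-of-the-envelope sketch. It uses only the rounded figures quoted above, and it treats the 10 million addresses as a generous upper bound on people, which is an assumption.

```python
# Rounded figures quoted in the post.
google_cap_bn = 600           # Google market cap, billions of USD
crypto_cap_top_100_bn = 24    # Bitcoin + Ethereum + 98 others, billions of USD
wechat_users_m = 700          # WeChat active users, millions
btc_addresses_m = 10          # addresses with at least 0.001 BTC, millions

cap_ratio = google_cap_bn / crypto_cap_top_100_bn
user_ratio = wechat_users_m / btc_addresses_m

print(f"Google vs. top-100 crypto market cap: {cap_ratio:.0f}x")
print(f"WeChat users vs. bitcoin addresses:   {user_ratio:.0f}x")
```

With these rounded inputs the gap is about 25x on market cap and about 70x on users, so both sit above an order of magnitude.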

So yes, the fights in the decentralized world matter to those of us who are in it, but we should not forget that for now we are a pimple on the face of centralized systems around the world. And I suspect that will be the case for quite some time. Why? Because the vast bulk of end users don’t yet care about decentralization. There are no tangible benefits for them and mostly downsides in terms of systems that are cumbersome and risky to use.

If you have been around long enough this will remind you strongly of the early days of open source and free software which goes back at least to 1983 with the GNU Project. Linux got going in 1991 and MySQL in 1995. While today open source software is widely used it was a two decade plus road to get there and closed source is still a massive business. Bitcoin by comparison is only 8 years old, Ethereum a mere 2 years.

So my expectation is that centralized systems will serve the vast bulk of users quite well for many years to come until their innovation slows down, they become too extractive and onerous in other ways and at the same time blockchain-based decentralized systems become stable and easy to use. I expect that to be at least a decade long road ahead.

To be clear: I believe that going down that road is incredibly important and at USV we plan to continue actively investing in decentralization. It is just important to develop some perspective about where we are and how long it will likely take for blockchain-based decentralized systems to have the kind of impact on the world that open source has today.


Thunderbolting Your Video Card by Jeff Atwood

When I wrote about The Golden Age of x86 Gaming, I implied that, in the future, it might be an interesting, albeit expensive, idea to upgrade your video card via an external Thunderbolt 3 enclosure.

I'm here to report that the future is now.

Yes, that's right, I paid $500 for an external Thunderbolt 3 enclosure to fit a $600 video card, all to enable a plug-in upgrade of a GPU on a Skull Canyon NUC that itself cost around $1000 fully built. I know, it sounds crazy, and … OK fine, I won't argue with you. It's crazy.

This matters mostly because of 4k, aka 2160p, aka 3840 × 2160, aka Ultra HD.

4k compared to 1080p

Plain old regular HD, aka 1080p, aka 1920 × 1080, is one quarter the size of 4k, and ¼ the work. By today's GPU standards, HD is pretty much easy mode. It's not even interesting. No offense to console fans, or anything.
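The "one quarter" figure is just pixel counting. A minimal sketch, assuming the work scales with the number of pixels drawn (a simplification; real GPU load depends on much more than resolution):

```python
def pixels(width, height):
    """Total number of pixels in a frame."""
    return width * height


uhd = pixels(3840, 2160)  # 4k / 2160p
hd = pixels(1920, 1080)   # 1080p

print(f"4k:    {uhd:,} pixels")
print(f"1080p: {hd:,} pixels")
print(f"ratio: {uhd / hd:.0f}x")
```

Doubling both the width and the height gives four times the pixels per frame, not two.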

Late in 2016, I got a 4k OLED display and it … kind of blew my mind. I have never seen blacks so black, colors so vivid, on a display so thin. It made my previous 2008 era Panasonic plasma set look lame. It's so good that I'm now a little angry that every display that my eyes touch isn't OLED already. I even got into nerd fights over it, and to be honest, I'd still throw down for OLED. It is legitimately that good. Come at me, bro.

Don't believe me? Well, guess which display in the below picture is OLED? Go on, guess:

Guess which screen is OLED?

There's a reason every site that reviews TVs had to recalibrate their results when they reviewed the 2016 OLED sets.

In my extended review at Reference Home Theater, I call it “the best looking TV I’ve ever reviewed.” But we aren’t alone in loving the E6. Vincent Teoh at HDTVtest writes, “We’re not even going to qualify the following endorsement: if you can afford it, this is the TV to buy.” Rtings.com gave the E6 OLED the highest score of any TV the site has ever tested. Reviewed.com awarded it a 9.9 out of 10, with only the LG G6 OLED (which offers the same image but better styling and sound for $2,000 more) coming out ahead.

But I digress.

Playing games at 1080p in my living room was already possible. But now that I have an incredible 4k display in the living room, it's a whole other level of difficulty. Not just twice as hard – and remember current consoles barely manage to eke out 1080p at 30fps in most games – but four times as hard. That's where external GPU power comes in.

The cool technology underpinning all of this is Thunderbolt 3. The Thunderbolt cable bundled with the Razer Core is rather … diminutive. There's a reason for this.

Is there a maximum cable length for Thunderbolt 3 technology?

Thunderbolt 3 passive cables have maximum lengths.

  • 0.5m TB 3 (40Gbps)
  • 1.0m TB 3 (20Gbps)
  • 2.0m TB 3 (20Gbps)

In the future we will offer active cables which will provide 40Gbps of bandwidth at longer lengths.

40Gbps is, for the record, an insane amount of bandwidth. Let's use our rule of thumb based on ultra common gigabit ethernet, that 1 gigabit = 120 megabytes/second, and we arrive at 4.8 gigabytes/second. Zow.
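Here's that rule of thumb as a quick sketch, next to the raw bits-to-bytes conversion (divide by 8). The function names are mine, and treating the gap between 120 and 125 megabytes per gigabit as protocol overhead is my reading of the rule, not something stated above.

```python
MB_PER_GIGABIT_RULE_OF_THUMB = 120  # gigabit ethernet rule of thumb
BITS_PER_BYTE = 8


def rule_of_thumb_gb_per_s(gbps):
    """Practical throughput estimate in gigabytes/second."""
    return gbps * MB_PER_GIGABIT_RULE_OF_THUMB / 1000


def raw_gb_per_s(gbps):
    """Theoretical ceiling in gigabytes/second, ignoring all overhead."""
    return gbps / BITS_PER_BYTE


for gbps in (20, 40):  # the two Thunderbolt 3 cable speeds listed above
    print(f"{gbps} Gbps: ~{rule_of_thumb_gb_per_s(gbps):.1f} GB/s "
          f"(raw ceiling {raw_gb_per_s(gbps):.1f} GB/s)")
```

For 40Gbps this gives roughly 4.8 GB/s against a raw ceiling of 5.0 GB/s, and half of each for the longer 20Gbps cables.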

That's more than enough bandwidth to run even the highest of high end video cards, but it is not without overhead. There's a mild performance hit for running the card externally, on the order of 15%. There's also a further performance hit of 10% if you are in "loopback" mode on a laptop where you don't have an external display, so the video frames have to be shuttled back from the GPU to the internal laptop display.
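To get a feel for what those hits mean in practice, here is a rough sketch. It assumes the two penalties simply add up (15% external, 25% with loopback) and uses a hypothetical 60fps baseline for the same card in a desktop slot; neither assumption is a measured result.

```python
def effective_fps(internal_fps, loopback=False,
                  external_hit=0.15, loopback_hit=0.10):
    """Estimate frame rate for a GPU in an external enclosure.

    Assumes the external and loopback penalties are additive.
    """
    total_hit = external_hit + (loopback_hit if loopback else 0.0)
    return internal_fps * (1.0 - total_hit)


baseline = 60  # hypothetical fps with the card in a desktop PCI express slot
print(f"external display: ~{effective_fps(baseline):.0f} fps")
print(f"laptop loopback:  ~{effective_fps(baseline, loopback=True):.0f} fps")
```

Under those assumptions a 60fps game lands at about 51fps on an external display and about 45fps in loopback mode.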

This may look like a gamer-only thing, but surprisingly, it isn't. What you get is the general purpose ability to attach any PCI express card to any computer with a Thunderbolt 3 port and, for the most part, it just works!

Linus breaks it down and answers all your most difficult questions:

Please watch the above video closely if you're actually interested in this stuff; it is essential. I'll add some caveats of my own after working with the Razer Core for a while:

  • Make sure the video card you plan to put into the Razer Core is not too tall, or too wide. You can tell if a card is going to be too tall by looking at pictures of the mounting rear bracket. If the card extends significantly above the standard rear mounting bracket, it won't fit. If the card takes more than 2 slots in width, it also won't fit, but this is more rare. Depth (length) is rarely an issue.

  • There are four fans in the Razer Core and although it is reasonably quiet, it's not super silent or anything. You may want to mod the fans. The Razer Core is a remarkably simple device internally: it's really just a power supply, some Thunderbolt 3 bridge logic, and a PCI express slot. I agree with Linus that the #1 area Razer could improve in the future, beyond generally getting the price down, is to use fewer and larger fans that run quieter.

  • If you're putting a heavy hitter GPU in the Razer Core, I'd try to avoid blower style cards (the ones that exhaust heat from the rear) in favor of those that cool with large fans blowing down and around the card. Dissipating 150W+ is no mean feat and you'll definitely need to keep the enclosure in open air … and of course within 0.5 meters of the computer it's connected to.

  • There is no visible external power switch on the Razer Core. It doesn't power on until you connect a TB3 cable to it. I was totally not expecting that. But once connected, it powers up and the Windows 10 Thunderbolt 3 drivers kick in and ask you to authorize the device, which I did (always authorize). Then it spun a bit, detected the new GPU, and suddenly I had multiple graphics cards active on the same computer. I also installed the latest Nvidia drivers just to make sure everything was shipshape.

  • It's kinda ... weird having multiple GPUs simultaneously active. I wanted to make the Razer Core display the only display, but you can't really turn off the built in GPU – you can select "only use display 2", that's all. I got into several weird states where windows were opening on the other display and I had to mess around a fair bit to get things locked down to just one display. You may want to consider whether you have both "displays" connected for troubleshooting, or not.

And then, there I am, playing Lego Marvel in splitscreen co-op at glorious 3840 × 2160 UltraHD resolution on an amazing OLED display with my son. It is incredible.

Beyond the technical "because I could", I am wildly optimistic about the future of external Thunderbolt 3 expansion boxes, and here's why:

  • The main expense and bottleneck in any stonking gaming rig is, by far, the GPU. It's also the item you are most likely to need to replace a year or two from now.

  • The CPU and memory speeds available today are so comically fast that any device with a low-end i3-7100 for $120 will make zero difference in real world gaming at 1080p or higher … if you're OK with 30fps minimum. If you bump up to $200, you can get a quad-core i5-7500 that guarantees you 60fps minimum everywhere.

  • If you prefer a small system or a laptop, an external GPU makes it so much more flexible. Because CPU and memory speeds are already so fast, 99.9% of the time your bottleneck is the GPU, and almost any small device you can buy with a Thunderbolt 3 port can now magically transform into a potent gaming rig with a single plug. Thunderbolt 3 may be a bit cutting edge today, but more and more devices are shipping with Thunderbolt 3. Within a few years, I predict TB3 ports will be as common as USB3 ports.

  • A general purpose external PCI express enclosure will be usable for a very long time. My last seven video card upgrades were plug and play PCI Express cards that would have worked fine in any computer I've built in the last ten years.

  • External GPUs are not meaningfully bottlenecked by Thunderbolt 3 bandwidth; the impact is 15% to 25%, and perhaps even less over time as drivers and implementations mature. While Thunderbolt 3 has "only" PCI Express x4 bandwidth, many benchmarkers have noted that GPUs moving from PCI Express x16 to x8 has almost no effect on performance. And there's always Thunderbolt 4 on the horizon.

The future, as they say, is already here – it's just not evenly distributed.

I am painfully aware that costs need to come down. Way, way down. The $499 Razer Core is well made, on the vanguard of what's possible, a harbinger of the future, and fantastically enough, it does even more than what it says on the tin. But it's not exactly affordable.

I would absolutely love to see a modest, dedicated $200 external Thunderbolt 3 box that included an inexpensive current-gen GPU. This would clobber any onboard GPU on the planet. Let's compare my Skull Canyon NUC, which has Intel's fastest ever, PS4 class embedded GPU, with the modest $150 GeForce GTX 1050 Ti:

1920 × 1080 high detail
Bioshock Infinite          15 → 79 fps
Rise of the Tomb Raider    12 → 49 fps
Overwatch                  43 → 114 fps

As predicted, that's a 3x-5x stompdown. Mac users lamenting their general lack of upgradeability, hear me: this sort of box is exactly what you want and need. Imagine if Apple was to embrace upgrading their laptops and all-in-one systems via Thunderbolt 3.
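For the curious, the per-game multipliers fall straight out of the numbers above. A small sketch, reading each pair as embedded GPU fps followed by GTX 1050 Ti fps:

```python
# (embedded GPU fps, GTX 1050 Ti fps) at 1920 × 1080 high detail
benchmarks = {
    "Bioshock Infinite": (15, 79),
    "Rise of the Tomb Raider": (12, 49),
    "Overwatch": (43, 114),
}

speedups = {game: after / before
            for game, (before, after) in benchmarks.items()}

for game, speedup in speedups.items():
    print(f"{game:<24} {speedup:.1f}x")
```

That works out to roughly 5.3x, 4.1x and 2.7x, which is where the 3x-5x range comes from.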

I know, I know. It's a stretch. But a man can dream … of externally upgradeable GPUs. That are too expensive, sure, but they are here, right now, today. They'll only get cheaper over time.



Here's our exhaustive guide to Trump's 392-word NASA budget by The Planetary Society

We break down every sentence from Trump's new NASA budget, so you don't have to


A repeat of the space shuttle's bold test flight? NASA considers crew aboard first SLS mission by The Planetary Society

NASA has only flown astronauts aboard a rocket's first flight once, when John Young and Bob Crippen took space shuttle Columbia on the boldest test flight in history. What are the risks of repeating the feat for SLS?


Unraveling a Martian enigma: The hidden rivers of Arabia Terra by The Planetary Society

Arabia Terra has always been a bit of a martian enigma. Planetary scientist Joel Davis takes us on a tour of its valley networks and their significance in telling the story of water on Mars.


Signed, sealed but not delivered: LightSail 2 awaits ship date by The Planetary Society

Following a pre-ship review at Planetary Society headquarters, LightSail 2 is ready to be integrated with its Prox-1 partner spacecraft. The final shipping schedule, however, has yet to be determined.


March 22, 2017

The light at the end of the tunnel (is not necessarily an oncoming train) by Charlie Stross

So yesterday I got to type THE END, at (oddly enough) the end of a book I've been writing since last April. "Ghost Engine" is due out in July 2018, so having a complete draft is a bit of a relief, to put it mildly. (It takes 12 months for a book to work through the production pipeline, because publishers don't publish books, they operate a workflow process that runs in lockstep across multiple books in a pipeline.) Typing THE END doesn't mean it's finished, of course. It's currently with various trusted readers for comment, and I'm probably going to have to rewrite chunks of it. However, experience suggests that most of the work is now done. My books usually expand slightly as a result of the editing after they've emerged in draft, so it's pretty much a dead certainty that this will be my second-longest delivered novel (just longer than "Accelerando", at 145,100 words, shorter than the original Merchant Princes doorstep which finally saw the light of day in its original shape as "The Bloodline Feud", at 197,800 words). (For comparison, "Dune" weighs in at 188,000 words; one paperback page is approximately 330-350 words.)

Here's the funny thing about too much work: it feels as if you're spinning your wheels and not making progress at all. This year so far, I redrafted two novels, wrote about 45,000 words of fiction, checked one set of copy edits, checked two sets of page proofs, did a bunch of promotion for a book launch, and went on a one week business trip to New York and Boston. But until I typed THE END, yesterday, it felt as if I was losing ground and not getting anything done at all. Those two words, however significant they may look, are absolutely trivial: but psychologically, being able to draw a line through a to-do item (write GHOST ENGINE) makes all the difference, and I finally feel I can relax a little.

So, what am I doing next?

Well, "Dark State" (the second Empire Games book) should be in production imminently, which means I have to check copy edits and page proofs. And by the end of this year I need to deliver a final version of "Invisible Sun", the third book in the trilogy. (It's written, but the ending needs tightening up. Not to worry, I have a plan.) I've also got a short story to write for Wild Cards because that's been on my to-do list for, oh, only a decade.

But the manic to-do list (five books in production!) that has been my constant companion and cause of sleepless nights since 2013 is finally coming to an end (three books in production, dropping to two by August) and I can finally think about new projects again for the first time in about five years. After I take the rest of this week off work to recover—time off in lieu for working over the Christmas/New Year holidays, I guess.

Here's a lesson I learned the hard way: once you're over 40, you should never commit to work-overload five years in advance. You'll be five years older, with worse health and less stamina, trying to keep up a pace dictated by your younger self. Over-work is fine—in brief doses. But as a continuous lifestyle for half a decade, it really sucks.

Meanwhile, I've got a bunch of convention travel commitments coming up this summer, including Italy, Germany, the Netherlands, Finland, and possibly even Nottingham, England! I've also got speaking gigs at the Edinburgh Science Festival and possibly the Edinburgh Book Festival, and there might just be some sort of launch event for "The Delirium Brief" in July. I'm going to put together an omnibus announcement on Friday (I'm awaiting an announcement from one of the conventions in question first).


Jenny Slate on Chris Evans. by Feeling Listless

Film On Monday, The Guardian's G2 published an astonishingly invasive interview with Jake Gyllenhaal in which the interviewer asked the actor about his relationship with Taylor Swift and then proceeded to essentially force an argument by persisting then wondering why the whole process went south. The interviewer clearly hoped to get a scoop and the actor didn't want to give him one, knowing that anything he said would enter the feedback loop on celebrity "news" websites.

For what it's worth, I think that in that kind of situation, if the actor wants to talk about their relationship they will, but there's no point pressing them on it, especially if they've otherwise given an indication of being a private person. Plus, frankly, it's probably none of our business. I'd rather hear an actor talk about the work and process and how they do their job, which is also in the interview to some extent but with less depth once Gyllenhaal is on the defensive.

If, however, they actually want to talk about their private life then that's fine, especially if it's as fascinating as this piece about Jenny Slate who recently got out of a relationship with Chris "Captain America" Evans. On the one hand, I'm slightly concerned about the extent to which his privacy is being broken here, the details of his life which are now in the public domain. But on the other she still clearly adores him and more importantly, there's nothing in here which contradicts his public image:

"Evans and Slate met at her chemistry read — the audition in which it’s determined whether two romantic leads play well together — and they instantly got along. “I remember him saying to me, ‘You’re going to be one of my closest friends.’ I was just like, ‘Man, I fucking hope this isn’t a lie, because I’m going to be devastated if this guy isn’t my friend.’ ” The first time they went out to dinner, as co-workers getting to know each other, she remembers insisting they split the bill over Evans’s strenuous objections. “If you take away my preferences, you take away my freedom,” she says she told him. “Then I was like, Oh, man, is this dude going to be like, ‘Ugh, this bra-burner.’ Instead, he was like, ‘Tell me more.’"

Of course, now he's probably going to be asked about the contents of this interview and so the feedback loop begins again.


March 21, 2017

Hadley Freeman on Love Actually. by Feeling Listless

Film After a slightly confused (slightly?) column about trans identity at the weekend (see this Twitter thread for key objections), Hadley Freeman's back re-iterating many of the same points from my old Love Actually post (which is here in case you haven't read it):

"My best friend and I went to see it on the day it opened, excited as babes on Christmas Day. We loved Notting Hill, and we adored Four Weddings and a Funeral, and sure, they were basically odes to the Oxbridge-educated, but they had charm and clever scripts. And those movies only had one plot line – this new one had nine! That meant it would be nine times as awesome, right? Wrong. We emerged from the cinema with faces frozen like Munch’s Scream, and silently went our separate ways. We called each other later to check in on one another, like victims of a terrible disaster."

Unlike Hadley I actually liked Love Actually at first. Saw it twice at the cinema, asked for the dvd at Christmas (etc etc) and it wasn't until I studied it for the dissertation I noticed how disdainful it is. Let's see if the Comic Relief sequel acts as a corrective.


March 19, 2017

My Favourite Film of 1907. by Feeling Listless



Film It’s 1907 and here’s singer Jean Noté performing the French national anthem with synchronous sound. The Jazz Singer’s generally thought of as being the first sound film, and although it’s true that it was the first to utilise a technique which was commercially repeatable, the director Georges Mendel achieved a similar effect over a decade earlier. One can well imagine, if this was shown in cinemas, the audience taking to their feet and singing along. Just remarkable.



But this still isn't the earliest. William Dickson was experimenting with sound in the mid-late 1890s.


Subscriptions (feed of everything)


Updated using Planet on 25 March 2017, 05:48 AM