Embracing piracy

I had an idea yesterday and it seemed so good on the surface that I immediately assumed one or more of the following were true:

  • It was trivial and probably already done to death.
  • My reasoning on the subject was flawed, incomplete, or unreasonable.

This happens to me very often, as I think about new ideas a lot (it’s sort of my hobby), often way beyond my area of expertise. It’s fun and I find the process interesting, but by definition I’m looking for diamonds by digging randomly, so it’s important not to get overly enthusiastic about everything that seems to sparkle at first glance.

Anyway, I did some googling today and did not quite find this exact idea, so here goes. Trigger warning: the idea involves the word ‘blockchain’, which is a buzzword, so it will probably make you want to groan. Stay with me for five minutes and see what you think, though. I’d appreciate it.

The industry seems to be trying to “solve” piracy with the blockchain. There are some ideas here, some more promising than others, but none of those seem to truly get it.

I believe we should be using the blockchain to embrace and improve piracy; not try to fight it.

Have you pirated a movie recently? I live in a country where piracy is essentially legal as long as you don’t seed, so I’m privileged in that way (that and many others). Here’s what it looks like for me: you look for a movie or a series on The Pirate Bay – which is usually up, even though it’s probably one of the most resourcefully hated sites on the internet (measured in dollars belonging to the people that hate it) – then click on its magnet link. You then wait a minute (with my internet connection) and have a copy of the movie in question. It has the following characteristics, most of them probabilistic:

  • It might or might not be good quality. But a lot of the recent stuff is essentially perfect quality; as good as Netflix, with no buffering ever once the download is over.
  • It might or might not have built-in subtitles, which are nice. I’d say about 50% of recent series and films come in a container that supports subtitles, like Matroska, and include at least English-language subtitles, which are useful enough for English-as-a-second-language speakers (and there are a lot of them on the internet). Note that these subtitles are very often fan-made, at significant effort, by random people around the internet. When something doesn’t have built-in subtitles, you can use a subtitle downloader that finds the right subtitles for you in one click or command line – most of the time, anyway.

Honestly, I’d be fine just pirating everything with this procedure. It’s easy, and it can be made easier still (you can use something like Popcorn Time to make the whole process look even more like Netflix). The only issue is that it’s pirating. So the people that made an effort in producing first the movie, then the digital file, don’t get rewarded for it:

  • The producers don’t get money from me.
  • All the other people involved in the making of the actual movie probably don’t get that extra bit of money from the producers. This includes the most visible, like actors and authors and directors.
  • The pirates don’t get money from me. This includes the subtitlers and encoders whose work I also get to enjoy.

So here I am thinking: if I could watch a movie, determine how much I liked it, and then pay one to ten bucks for it, I’d do it. I swear I would – if the money went to all the parties that were involved in the making of the movie and the file that contains it, in a way I would find fair.

See where this is going? Matroska could somehow embed the wallet addresses of the people involved in making the movie. The wallets should probably be in a tree-based structure, along with a set of rules for the distribution of funds from each node to its children. People could keep things easy by sending money to the root node and accepting the distribution rules set in the file; or fine-tune and send the money directly to the nodes they care about the most.
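To make the tree idea concrete, here’s a minimal sketch in Python. Everything in it is hypothetical – the node structure, the wallet names, the payout rule – it’s just one way the embedded metadata could drive a distribution:

```python
from dataclasses import dataclass, field

@dataclass
class WalletNode:
    """A hypothetical payee in the file's embedded payment tree."""
    address: str            # wallet address (made up for illustration)
    share: float            # fraction of incoming funds this node keeps
    children: list = field(default_factory=list)

def distribute(node, amount):
    """Recursively split a payment according to the tree's rules.

    Each node keeps `share` of what reaches it and forwards the rest
    to its children, in proportion to their own `share` values.
    """
    payouts = {node.address: amount * node.share}
    remainder = amount * (1 - node.share)
    total_child_share = sum(c.share for c in node.children) or 1
    for child in node.children:
        child_amount = remainder * (child.share / total_child_share)
        for addr, value in distribute(child, child_amount).items():
            payouts[addr] = payouts.get(addr, 0) + value
    return payouts

# A toy tree: the studio keeps 50%, the rest flows to director and encoder.
tree = WalletNode("studio-wallet", 0.5, [
    WalletNode("director-wallet", 1.0),
    WalletNode("encoder-wallet", 1.0),
])
print(distribute(tree, 10.0))
# → {'studio-wallet': 5.0, 'director-wallet': 2.5, 'encoder-wallet': 2.5}
```

Fine-tuning, in this sketch, would just mean paying a non-root node directly instead of the root.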

Of course the devil is usually in the details: how do we validate this tree and its addresses, allowing updates but not letting the system be taken over by scammers? Well, you know the semi-answer already. It’s the buzzword.

The participants should vote in some way and reach a consensus. Different groups of people could even converge on different solutions: lawful people would probably always compensate the studios, and go through the existing wealth distribution system, while other groups might want to mostly give their money to authors and performers.

Once this system is in place, I see no reason why it should be restricted to video files. Every digital file could be in scope here. I find a picture in GIS, find it useful for my purposes, download it and use it. I could send a buck or two to whoever bothered to claim ownership of it.

Once this system is in place, every person and company has an interest in being part of the system. Why not sign all your files and put your wallet address in them? It can’t hurt, and most modern files include metadata anyway. The process of embedding wallet information in every file you touch could be made quite straightforward.

So, there it goes. It’s not a complex idea. Surely someone is building this.

Where are they?

Cover

If Flancia were a book, I would like the following to be its cover: a silhouette of a person, head and torso. It is a stylized drawing.

A small garden is growing off the person’s head. One can notice a tall lotus.

Music as meditation

The benefits of meditation on well-being are widely studied. It seems natural to me to think that music should have many of the same benefits that meditation has – both listening to it and making it, if done with the right kind of single-minded absorption that characterizes meditation. Whereas in meditation the focus is on the self, in music the point of focus is a set of musical ideas, or the succession in time of individual musical moments.

Meditation opens the door to meta-thinking: it teaches one to clear one’s mind, and in doing so makes it easier to realize that one is constantly thinking. It’s easy to just live life from one thought to the next, without exercising much control over that flow, never (or not often enough) stopping to ask why, or doing any other kind of meta-thinking.

Music probably opens up its own set of ideas, those motivated by the underlying structures (elements from music theory that are perceived aurally and enjoyed, or just noticed). Like a smell that evokes a memory, music primes parts of our brain to be stimulated more easily later, and thus used in other contexts.

The random seller

A short story set in the future about a seller of randomness. Randomness is highly valued because it’s needed for any useful work, and its production is heavily regulated.

People get together to try to produce randomness by throwing dice as it’s the best free way available.

“I need to go buy something random”.

Drafts

I’ve got plenty of drafts that I never finish, so I thought I’d just write something and post it right away for a change.

I guess my draft problem is an editing problem. I never sit down to edit – it’s not something that comes naturally to me. I like to write down quick ideas, I guess because writing something down gets me a quick fix. Polishing – my writing, a piece of code, my shoes – has never been my forte. It feels like actual work.

Just after you write something you get to feel sort of OK about all the mistakes that are in it. “It’s just a quick draft, of course it’s clunky”. Excuses like this grow thinner once you’ve actually made an effort at something.

Two feature requests for Google Keep

I wrote this a few months ago, but never got round to editing and publishing. Now I found myself on a long flight with some time to spare, so here goes.

A rant

Google Keep is frustrating. On Chrome, sometimes I go to the Keep tab I keep pinned, and it turns out that I’m in a view that doesn’t let me take any single action to get to a “create note” flow. Usually it happens when I’m scrolled somewhere into my stack of notes, or when I have a note I’m already working on. This feels, frankly, very cumbersome to me. With the disclaimer that I don’t know much about UI/UX, and that great people work at Google in these fields: I think this is probably not the right UI for this tool. When I get to Keep I’m usually trying to get what I want quickly, and that is one of two actions:

  • Find (to read or edit) a note I’ve written.
  • Write a new note.

Usually I’m in a rush: I’m in the middle of something and I need a note I’ve written in the past to finish the task; or I have just thought of something or found something and I want to write it down lest I forget (a number, a to-do, a random thought – I’m a very forgetful person). I want Keep to be the tool that I can use for organizing my life. It shouldn’t get in the way.

Unfortunately, Keep fails at both flows listed above in at least some ways. First, what I’ve already mentioned – in several contexts, there’s no simple way to get to one of the actions I want to take. I’d rather not deal with extra actions like scrolling and searching for the ‘Take a note’ text area, or pressing escape to exit ‘Search’ when I’ve left Keep on the Search screen, or clicking on different UI elements to do the same. I think ‘Search’ and ‘Write’ should always be available, in every view, in a consistent way. Different always-present buttons (say a ‘+’ to compose, like Gmail does?) seem superior to me.

On top of that, the mobile and web versions are inconsistent; whereas in web ‘Take a note’ eventually shows up at the top of the UI, on mobile it’s at the bottom. They should work the same way in this respect. Perhaps mobile is closer to the right UI here, as at least ‘Search’ and ‘Take a note’ are in clearly distinct places in the UI.

Details all the way down

I say all this not to shame the Keep designers and engineers, who I’m sure are brilliant. Designing UIs and writing software is hard.

Let’s assume for a minute that my gripes are representative of a significant fraction of the user base; if mine aren’t the representative ones, there are others like them that are. The designers and engineers very likely have a good feel for what’s wrong with their app, and in the back of their minds they dream of the time when they’ll be able to just go ahead and fix it. But the moment keeps slipping – doing anything is hard, harder than you think, and any UX or architectural change probably takes them not just 10x but perhaps 100x the effort they feel it should take. I know this is how I feel at my own job, at least.

Most software fixes are easy conceptually, but hard in practice. There are details everywhere that need to be taken care of; details all the way down. Change something into something else. Fix the callers. Fix the tests. Perhaps you need a refactor – that’s fine, our tools nowadays can help there. But they can’t help with writing most of the actual new logic, or with shipping a larger-scale change (something that changes the server architecture) safely. Some day they may – and if that ever happens, programmers will be able to focus on different things. And software may be noticeably better.

The wild part

Now for the wild part: I’m not an expert in tooling, but ML-enabled advances in software development tools could make some or most of the steps involved in shipping software changes automatable – or at least assisted. I’m sure researchers are working on some aspects of the hypothetical toolchain that will get us there, and thinking about the others. I don’t have any particular insight into the problem, but I wanted to think a bit about how some of the steps in this process could work, and what it would mean. What it would feel like.

Some wishful thinking: what would happen if ML could tackle parts of what we currently call programming? First of all: programming would probably be redefined, as usually happens. It may end up being redefined many times as ML iterates and infringes on this field of human thought, and human activity, the same as elsewhere. Eventually programming may cease to be about C++ or Java, and become more and more about reporting the right bugs (in the right format) and sending the right feature requests to some fancy reinforcement-learning coder-agent. That agent will then do the “mindless” thing – write the new method, fix the tests, submit the change, and shepherd it through the release process all the way to production. Perhaps it will even monitor how well it works once it ships. It won’t do everything in that list right away, of course, but even if it only helped here and there it would add value; these may turn out to be iterative improvements that happen as we progress in this field. I’m not sure about anybody else, but honestly, I’m sort of looking forward to all of this.

What could a next-generation code-writing assistant look like? One idea might be to augment test-based development: you write the function signature, then you write tests for it, asserting what valid output looks like for given inputs. Sound familiar? Expected outputs in unit tests sound like a kind of labelling. A generative model (similar to GPT-2) could presumably be trained on a huge amount of code, and potentially learn the utterances that are most likely to yield the expected output. A programmer could then look at failed solutions and give feedback on high-level issues to be fixed, or mutations to be tried. For example: indicate that more code involving a certain variable should be written (one that the programmer can see needs to be transformed in some way). Or, perhaps, add some intermediate logic that the programmer knows should happen eventually: do something for every element in an array; or define a variable with some descriptive name, as a hint that leads the model along the right path.
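As a toy illustration of “tests as labels”, here’s a sketch where a few hand-written candidate implementations stand in for what a generative model might propose, and the programmer’s unit tests filter them. The target function, the candidates, and the tests are all made up for illustration; no real model is involved:

```python
# The programmer-written tests double as labels for the search:
# given these inputs, a correct implementation must produce these outputs.
def spec_tests(fn):
    try:
        return fn([1, 2, 3]) == 6 and fn([]) == 0 and fn([-1, 1]) == 0
    except Exception:
        # A candidate that crashes on some input simply fails the spec.
        return False

# Candidate bodies a model might emit for `def total(xs): ...`
candidates = [
    lambda xs: len(xs),   # wrong: counts elements instead of summing
    lambda xs: max(xs),   # wrong: crashes on [], wrong elsewhere anyway
    lambda xs: sum(xs),   # right: passes every test
]

# Keep only the candidates that satisfy the spec.
survivors = [fn for fn in candidates if spec_tests(fn)]
print(len(survivors))  # → 1
```

A real assistant would of course sample candidates from a trained model rather than a hand-written list, but the filtering loop – propose, run the tests, discard failures – would look much the same.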

Anyway, I’ve added a to-do to my Google Keep list to investigate what the researchers are up to in ML-based code-generation/change assistance. As usual I’m writing mostly naively, and these ideas are very likely very old. But I find writing to be a good way to realize what I’m interested in – what I clearly don’t know, but would like to know.

Flying

I’m really enjoying “Hands-On Machine Learning with Scikit-Learn and Tensorflow” by Aurélien Géron. It doesn’t sound like a page-turner, I know, but I’ve been having great fun just reading it cover to cover on this long flight I’m on. I needed a book that gave me a high-level overview of the whole field of Machine Learning, and this is it. It was recommended in the Machine Learning podcast I listened to a few months ago.

The first part of the book covers the basics and Scikit-Learn – no deep learning until page 230. I had heard in several places that it’s not a good idea to skip straight to deep learning even if you think you’re going to end up using deep learning for your models (I think Andrew Ng also mentions this many times), and I can see why; there are many interesting “shallow” algorithms, and this book covers interesting theory while discussing them. Scikit-Learn also provides a lot of useful goodies that are likely to see use even if you’re mostly working with Tensorflow: utility functions, and of course simpler ways of getting shallow models working. I particularly liked the way in which Scikit-Learn lets you set up “pipelines” of transformations and trainers. Finally, Scikit-Learn has great support for decision trees – and it turns out that decision trees are state of the art for many problems in practice, and have the advantage of yielding explainable (“white box”) models, so there’s that too.

I read that Tensorflow supports the Scikit-Learn API, but at this point I’m not sure what that means, and I can’t check as I have no internet connection on this flight. I hope you’re able to train the whole range of shallow models through it, straight in Tensorflow, as it’d be awkward/annoying to have to set up different systems to train shallow and deep models.
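As an aside, the pipeline idea is simple enough to sketch in plain Python: a chain of steps, each with fit/transform, applied in order. The steps below are made up for illustration; the real thing is Scikit-Learn’s `sklearn.pipeline.Pipeline`:

```python
class Scale:
    """Rescales values relative to the max seen during fit."""
    def fit(self, xs):
        self.max = max(xs)
        return self
    def transform(self, xs):
        return [x / self.max for x in xs]

class Shift:
    """Centers values around the mean seen during fit."""
    def fit(self, xs):
        self.mean = sum(xs) / len(xs)
        return self
    def transform(self, xs):
        return [x - self.mean for x in xs]

class Pipeline:
    """Applies each step's fit/transform in sequence."""
    def __init__(self, steps):
        self.steps = steps
    def fit_transform(self, xs):
        for step in self.steps:
            xs = step.fit(xs).transform(xs)
        return xs

pipe = Pipeline([Scale(), Shift()])
print(pipe.fit_transform([2.0, 4.0, 6.0]))
```

The nice part, in both this sketch and the real library, is that preprocessing and training become one composable object you can fit in a single call.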

Anyway, I’m now officially in the Tensorflow part of the book and I’m also happy about that. At work I just got to the stage in which I am ready to actually go out and train a model for my first ML-related project, so reading more about Tensorflow in preparation for that has been an exciting way of spending the long flight. Some of it I had already used, but reviewing is how I learn. I’m using the first edition of the book, not the fancy new one that’s about to come out and covers Tensorflow 2, but I think I made the right call by getting the “old” one (from 2016) instead of waiting for the new edition that I knew was coming out. Sure, some parts are likely outdated (it mentions Tensorflow 1.4 as experimental), but the book is working well for me as it is. The background I’m getting should come in handy for my project; this information wouldn’t have been as useful if it had come to me in six months. I’ll probably use Tensorflow 1.7 for my first TF project anyway, so there’s that too. Having said that, I like the book enough that I may get the second edition depending on the reviews it gets (and exactly what changes in it). Reading the updated version would be yet another way of reviewing.

9900 hours to go

Whoa, there go 15 days without posting. It’s funny, how many times have I run into random abandoned blogs where the last few posts begin like this post? I wonder if I’ll quit anytime soon. I don’t feel like quitting really, I’ve just been busy with other things. Let’s wait and see :)

I wrote several things that I thought of posting, then didn’t because I felt they needed work. It’s funny – I don’t really have any regular readers, I think, so it doesn’t make an immediate difference whether I post what I write straight away (and perhaps edit it live) or postpone publishing until I’ve “polished” it (which realistically may never happen).

I feel like I mostly write for archive.org; what I write could end up persisting there, and might be read many years from now. This train of thought led me to write one of the pieces I mentioned. I will probably clean it up a bit and post it after finishing this entry (but then again, maybe not).

Going back to the 15 day long hiatus: I’ve been busy with several things related to Machine Learning – and, to put it succinctly, that makes me happy.

I have this fuzzy long-term plan to learn it well, and a more tactical approach that consists in just keeping an eye out for reasonable opportunities to apply what I’ve learnt hands-on. Now I’ve finally found a project at work that could benefit from ML, so I’ve put aside some of my time to experiment with it. Everything is taking long, as is usually the case with programming (for me, anyway), but I really enjoy the process, so for once I don’t really mind. It’s such a nice change; I don’t really feel this way about my day job, usually. I don’t think I’ve felt this way about something technical since I was in university.

I’ve also read some papers that I found stimulating.

I’ve probably spent 35 hours over the last three weeks doing ML, and if I remember correctly I’ve enjoyed all of it. Overall, I reckon I’ve put about 100 hours into learning ML in all formats (Coursera and podcasts included) since I started.

Going by the now-contested 10000 hours rule of thumb – I have 9900 hours to go.