Monday, May 03, 2021

EtherFreakers

EtherFreakers is a decentralised game on the Ethereum blockchain. To play, an account must birth a freaker, an ERC-721 token. This requires a payment in ether to the EtherFreakers smart contract that is greater than 1.005 times the price paid for the (n/2)-th birthed freaker, where n is the current number of freakers. For the first freaker, any price greater than zero will do.
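
As a rough sketch of the pricing rule (illustrative only, not the contract's code; the exact indexing and rounding are details of the contract):

def min_birth_price(prices_paid):
    # prices_paid: prices paid for freakers birthed so far, in birth order.
    n = len(prices_paid)
    if n == 0:
        return 0  # the first freaker only needs a price greater than zero
    # "the (n/2)-th birthed freaker"; integer division is an assumption here.
    reference_price = prices_paid[n // 2]
    return 1.005 * reference_price  # must pay strictly more than this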

Each freaker is randomly assigned to one of eight species: mercury, venus, mars, jupiter, saturn, uranus, neptune, and pluto. Each freaker is also randomly assigned numeric values for a set of five attributes: agility, stamina, fortune, offence and defence. Both probability distributions are non-uniform, and the probability distribution for the attributes depends on the species.

Each freaker has energy. Before a freaker is birthed, half of the price is distributed to the existing freakers as energy and the other half is assigned directly to the new freaker as energy. Furthermore, an account can deposit or withdraw ether to increase (charge) or decrease (discharge) the energy of a freaker.
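
A sketch of the energy flows on birth; note that the even per-freaker split below is purely an assumption for illustration, as the contract may weight the distribution differently:

def apply_birth_payment(price, existing_energies):
    # Half of the payment is spread over existing freakers as energy.
    # NOTE: an even split is assumed here purely for illustration.
    if existing_energies:
        share = (price / 2) / len(existing_energies)
        existing_energies = [e + share for e in existing_energies]
    # The other half becomes the new freaker's starting energy.
    new_freaker_energy = price / 2
    return new_freaker_energy, existing_energies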

Each account has a combat multiplier with two components: offence and defence. When an account receives or sends a freaker, its combat multiplier increases or decreases, respectively, based on the offence and defence attributes of the freaker. A transfer also triggers a redistribution of some energy to all existing freakers.

A freaker can attack another freaker. The outcome is either a miss, a thwart or a capture. The outcome is random but is based on the attributes and energy of the freakers and the combat multiplier of the controlling accounts. An attack also triggers a redistribution of some energy to all existing freakers. In the case of a capture, the ERC-721 token of the loser is transferred to the winner. If an account has at least one freaker from each species, those freakers cannot attack or be attacked. 

Additionally, there are eight creators, a special type of token, that cannot attack or be attacked. These were created when the smart contract was first deployed and serve to siphon energy (and hence, ether) from the game.

Saturday, November 11, 2017

Bitcoin and FidoNet

I am watching the excellent BBS documentary. It contains a chapter on FidoNet: a pre-Internet worldwide computer network that linked bulletin board systems (BBSs). It is fascinating to listen to its creator and early users discuss its beginnings, growth and decline. I found the parallels between FidoNet and Bitcoin, in particular, between the founding of the International FidoNet Association (IFNA) and the founding of the Bitcoin Foundation, striking. The following are some quotes, taken directly from the documentary, that illustrate the similarities:
"Tom was the creator -- and you can look at that from a religious or a spiritual perspective. He created this thing that has a life of its own." (James Madill)
"Everyone has their own view of the "Tom Jennings' Dream"." (Frank Vest)
"The technology is easy. It's the politics that's hard." (Ryugen Fisher)
"It's just people that are trying to do their own power-plays. It's sad." (Tim Pozar)
"They're taking a really nice technology and distorting it to their own needs." (Tim Pozar)
"When you get to the point where you have people that don't know each other, then people perceive that there is an inside and an outside. And they're not inside. And things are being done to them from the inside." (Tom Jennings)
"Technically, the stuff works and if you want to spin off your own network, that's fine." (Tom Jennings)
"We felt that we needed some voice out there." (Tim Pozar)
"We needed some sort of lobbying voice or at least some voice that we could bring to government and say we're out there, this is how it works, you don't need to necessarily re-create laws for this sort of stuff." (Tim Pozar)
"The whole IFNA debate is a very sore point among a number of people." (Tom Jennings)
"Why do we need an IFNA? The answer was so that Ken and Ben could legally accept money to help support their efforts with FidoNet." (Bob Hartman)
"It just runs in spite of the idiots." (Tom Jennings)

Monday, July 08, 2013

Amazon DynamoDB, Apache Hive and Leaky Abstractions

We demonstrate two semantically equivalent HiveQL queries that use Amazon DynamoDB's storage handler but compile to MapReduce jobs with vastly different performance.

The combination of Amazon DynamoDB, Amazon Elastic MapReduce (Amazon EMR) and Apache Hive is a useful one when storing and analyzing large amounts of data. We can use Amazon DynamoDB to store time-stamped information across several applications by creating a table, myData, with the following structure:



We set the primary key type to hash and range: the hash attribute is named appId with type number, and the range attribute is named ts (a Unix timestamp) with type number. We can store any number of items with an arbitrary set of key-value pairs that are indexed by appId and ts. If we want to store more than one item with the same appId and ts, we can use time-based UUIDs instead of plain Unix timestamps.
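
For reference, an equivalent table definition sketched with Python and boto3 (the region name here is an assumption) looks like this:

import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")
dynamodb.create_table(
    TableName="myData",
    KeySchema=[
        {"AttributeName": "appId", "KeyType": "HASH"},
        {"AttributeName": "ts", "KeyType": "RANGE"},
    ],
    AttributeDefinitions=[
        {"AttributeName": "appId", "AttributeType": "N"},
        {"AttributeName": "ts", "AttributeType": "N"},
    ],
    ProvisionedThroughput={"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
)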

In the following example we set the read and write provisioned throughputs to one unit each. We populate the table using a Node.js script. This script sequentially adds 1,000 items to the table. Each item contains the two required attributes: appId, which ranges from 0 to 9, and ts, which ranges from 0 to 99 for each appId. The script warns when it exceeds the level of provisioned write throughput for the table, automatically backing off and retrying to ensure that all items are properly recorded.
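
A rough equivalent of that script in Python with boto3 (the original is Node.js, and the region name is an assumption) is:

import boto3

table = boto3.resource("dynamodb", region_name="us-east-1").Table("myData")

for app_id in range(10):       # appId 0..9
    for ts in range(100):      # ts 0..99 for each appId
        # Throttled writes are retried with back-off by the SDK, so all
        # 1,000 items are eventually written despite the 1-unit throughput.
        table.put_item(Item={"appId": app_id, "ts": ts})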

Now, we spin up an Amazon EMR cluster, myCluster, to perform the analyses. We choose "Hive program" for the job flow and "Start an Interactive Hive Session". We leave all other options unchanged: the master instance type is small (m1.small), the core instance count is two, the core instance type is small (m1.small) and the task instance count is zero.



When the cluster has started, we SSH into the master node and start an interactive Hive session. We connect to myData as follows:

CREATE EXTERNAL TABLE myData (appId BIGINT, ts BIGINT)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler' TBLPROPERTIES (
  'dynamodb.table.name' = 'myData',
  'dynamodb.column.mapping' = 'appId:appId,ts:ts',
  'dynamodb.throughput.read.percent' = '1.0'
);

We can count the total number of items in the table using:

SELECT count(1) FROM myData;

This correctly returns 1000 and takes approximately 1045 seconds to complete. The bottleneck is the table's provisioned read throughput. It takes approximately 1000 seconds to read the 1000 items; the remaining time is spent compiling the HiveQL statements into MapReduce jobs and submitting the jobs to the cluster for execution. If we want to improve the query's performance we can increase the provisioned read throughput.

Suppose we only want to analyze a subset of the data in our table. Of our ten appIds, perhaps only a few are active at any one time, and we want to count the number of items with a specified appId. We can use the following:

SELECT count(1) FROM myData WHERE appId = 0;

This correctly returns 100 and takes approximately 143 seconds to complete. This is reasonable. The hash key for the table is appId, so we only need to read 100 items from the table, but of course we incur a similar penalty as before for compiling the HiveQL statements and submitting the jobs to the cluster.

What happens if we have more than one active application?

SELECT count(1) FROM myData WHERE appId = 0 OR appId = 9;

This correctly returns 200 but takes approximately 1037 seconds to complete! The query is scanning the entire table even though the attribute or column in the predicate is indexed.

What about the following query?

SELECT
  sum(value)
FROM (
  SELECT count(1) AS value FROM myData WHERE appId = 0
  UNION ALL
  SELECT count(1) AS value FROM myData WHERE appId = 9
) u;

This also returns 200 but takes approximately 327 seconds to complete. Amazon DynamoDB's storage handler (org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler) uses a query when there is only one condition on the hash key in the WHERE clause. However, it uses a scan when there is more than one condition. This implementation is defined in hive-bigbird-handler.jar on the Amazon EMR cluster. Unfortunately, Amazon hasn't released the source code for the storage handler. It is also against Amazon's rules to decompile this file (see section 8.5, License Restrictions).
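
The difference is visible at the level of the DynamoDB API itself. The following boto3 sketch is illustrative only (it is not the handler's code): a Query touches just the items under a single hash key, whereas a Scan reads every item in the table and filters afterwards.

import boto3
from boto3.dynamodb.conditions import Attr, Key

table = boto3.resource("dynamodb", region_name="us-east-1").Table("myData")

# One condition on the hash key: a Query reads only the 100 items under appId = 0.
only_zero = table.query(KeyConditionExpression=Key("appId").eq(0))["Count"]

# Two hash-key values: no single Query covers this, so the handler falls back
# to a Scan, which reads all 1,000 items and filters them afterwards.
zero_or_nine = table.scan(
    FilterExpression=Attr("appId").eq(0) | Attr("appId").eq(9)
)["Count"]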

The HiveQL language abstracts away the procedural steps of a series of MapReduce jobs, allowing us to focus on what we want to retrieve and/or manipulate. However, this means that statements that are semantically equivalent may compile to MapReduce jobs with vastly different performance. Of course, all non-trivial abstractions are, to some degree, leaky.

Wednesday, November 30, 2011

Conway's Game of Life in the Cloud


Conway's Game of Life is a cellular automaton consisting of a 2-dimensional grid of cells, each of which is either dead or alive. Each cell undergoes state transitions according to its own state and the states of its eight neighboring cells. When a cell is alive, it remains alive if two or three of its neighboring cells are also alive; otherwise it dies. When a cell is dead, it changes to alive if three of its neighboring cells are alive; otherwise it remains dead. Though an individual cell is quite simple, the aggregate behavior of many cells can be quite complex. The lightweight spaceship, for example, is a configuration of cells that essentially moves across the grid.
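
The per-cell rule is compact enough to sketch directly:

def next_state(alive, live_neighbors):
    # A live cell survives with two or three live neighbors;
    # a dead cell becomes alive with exactly three.
    if alive:
        return live_neighbors in (2, 3)
    return live_neighbors == 3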

Although the Game of Life is a synchronous model, it can be simulated using an asynchronous model (see, for example, Nehaniv '02 for details). The basic idea is to keep the cells nearly synchronized at a local level by recording a local time at each cell. The system must ensure that the local times of neighboring cells differ by at most one unit of time. Each cell also records its previous state so that neighboring cells that lag behind it can catch up. This removes the need for a global clock. It also removes the need for any global state: each cell can record its own state and its (most recent) view of the state of its neighboring cells.
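
A rough sketch of this local-clock scheme, independent of the Storm-based implementation described below: each cell keeps its current state, its previous state and a local time, and may only advance when no neighboring cell lags behind it.

from dataclasses import dataclass

@dataclass
class Cell:
    state: bool       # state at local time `time`
    prev_state: bool  # state at local time `time - 1`
    time: int         # local clock

def can_step(cell, neighbors):
    # Advancing only when no neighbor is behind keeps neighboring
    # local times within one unit of each other.
    return all(n.time >= cell.time for n in neighbors)

def step(cell, neighbors):
    # A neighbor that is one step ahead exposes its previous state,
    # which is its state at this cell's local time.
    live = sum(
        n.state if n.time == cell.time else n.prev_state
        for n in neighbors
    )
    cell.prev_state = cell.state
    cell.state = live in (2, 3) if cell.state else live == 3
    cell.time += 1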


The embedded video shows a lightweight spaceship moving across a 14x7 toroidal grid. The black, white and gray cells are alive, dead and recently-dead respectively. The interesting part is how this visualization was generated. The asynchronous transitions were performed using Storm, a distributed and fault-tolerant real-time computation system, running on Amazon Web Services. The definitions for the spout, the component that generates the lightweight spaceship in the first place, and the bolts, the components that perform the transitions and propagate the changes to the neighboring cells, were written using RedStorm. RedStorm integrates JRuby with Storm and provides a DSL for defining spouts and bolts. A subset of the changes was further propagated to Pusher using a Ruby client. An HTML5 application subscribed to these updates using Pusher's JavaScript client. It rendered the grid using an HTML5 Canvas and Backbone.js and updated the cells as necessary.

The code for the Storm topology is available on GitHub.

Tuesday, May 24, 2011

The TidySongs Hustle

The hustle:
  1. Company A sells a software product for a once-off fee. The product requires access to Company A's servers in order to operate.
  2. Company B acquires Company A.
  3. Company B discontinues Company A's product and shuts down its servers, thereby rendering the product useless to all existing customers.
  4. Company B releases a "new" product whose functionality and interface are suspiciously similar to that of the discontinued product.
The hustlers: CloudBrain (Company A) and RealNetworks (Company B).

In November 2010 I bought TidySongs from CloudBrain for EUR35.68. TidySongs automatically organizes your iTunes library by removing duplicates, fixing spellings, filling in blank fields and adding missing album art. It had received mostly favorable reviews. Now, when I open TidySongs I get an error stating that it cannot connect to TidySongs' server.


Their website states that TidySongs has been discontinued. They link to a product called Rinse from RealNetworks that sells for $39. It too organizes your iTunes library by removing duplicates, fixing spellings, filling in blank fields and adding missing album art. They could at least have changed the interface a little.

Thursday, March 03, 2011

Renaming Tabs in jQuery UI

jQuery UI has a very nice tab system. However, as of version 1.8.10, there isn't an easy method of dynamically renaming tabs. The main difficulty is that the tab names are stored in an unordered list that is parallel to the contents of the tabs. There are a number of hacks and extensions on the web but no one-liners. So here is mine:

Tuesday, March 16, 2010

Unblogging

Google Reader is more than just a feed reader. It's a cache and an archive: http://www.google.com/reader/atom/feed/FEED_URL?r=n&n=X displays the last X posts from FEED_URL even if FEED_URL itself does not contain all of its last X posts. If the author of FEED_URL deletes an old post, it will not be removed from the cache. This is nothing strange; caching and archiving are part and parcel of the Internet.

However, there are a number of issues with this type of caching. Firstly, many authors try a number of "test" posts after creating a blog. If they view these posts in Google Reader, Reader fetches and caches them, and the test posts become difficult to delete. Secondly, there is no indication when browsing Google Reader that a post may have been deleted and that the author no longer wishes the post to be public. Thirdly, there is no robots.txt mechanism to restrict caching. Fourthly, Google does not delete posts from the cache by request.

In Google's own words: "Reader caches all entries in your feed as your feed most likely only contains your most recent entries. Unfortunately, there isn't a way for Reader to tell which items have been deliberately removed from your site as opposed to having just fallen off the end of your feed. You can create a blank item with the same GUID tag as the original item to at least remove the content from Reader and other feed readers. Contact your blogging software provider for more help with this issue."

So, how do you find the GUIDs of your deleted posts? Assuming you have a Google Account and an account with Blogger, the following Ruby script will output the GUIDs of your deleted posts:



Next, you need to edit the contents of each post using http://www.blogger.com/post-edit.g?blogID=12345&postID=67890. You may also have to bring the post dates forward so that Google Reader will notice the changes.