The Hacker’s Manual 2016 | 171 | Coding | Riak NoSQL
Riak offers a benchmarking tool called
Basho Bench. The graph is produced with R.
Data consistency
Data consistency in databases is critical. ACID
(Atomicity, Consistency, Isolation and Durability)
is a set of properties that guarantee that
database transactions perform reliably.
Atomicity means that when you do something
to change a database, the change should work
or fail as a whole. Consistency means that every
transaction leaves the database in a valid state.
Isolation means that if other things are taking
place at the same time on the same data, they
should not be able to see half-finished data.
Durability refers to the guarantee
that once the user has been notified of the
success of a transaction, the transaction will
persist, and won’t be undone even if the
hardware or the software crashes afterwards.
Graph databases perform ACID transactions
by default, which is a good thing. On the other
hand, not every problem needs ‘perfect’ ACID
compliance.
MongoDB is ACID-compliant at the
single document level, but it doesn’t support
multiple-document updates that can be rolled
back. Sometimes, you may be OK with losing a
transaction or having your DB in an inconsistent
state temporarily in exchange for speed.
You should carefully check the characteristics
of a NoSQL database and decide if it fits your
needs. If data consistency is absolutely critical
and your NoSQL DB doesn’t fully support it,
you can always implement it in code. Keep in
mind that this might be non-trivial, especially
in distributed environments.
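As a rough illustration of what implementing consistency in code can involve, the sketch below changes two records and undoes the first write if the second one fails. A plain Python dict stands in for the database, and the function and key names are hypothetical. A real distributed store would also have to cope with concurrent writers and node failures, which this sketch ignores.

```python
def transfer(store, src, dst, amount):
    """Move `amount` from store[src] to store[dst], all or nothing."""
    original_src = store[src]            # remember the value for rollback
    if original_src < amount:
        raise ValueError("insufficient balance")
    store[src] = original_src - amount   # first write
    try:
        store[dst] = store[dst] + amount  # second write; KeyError if dst is missing
    except Exception:
        store[src] = original_src         # compensate: undo the first write
        raise


accounts = {"alice": 100, "bob": 20}
transfer(accounts, "alice", "bob", 30)
print(accounts)

try:
    transfer(accounts, "alice", "nobody", 10)
except KeyError:
    print("second write failed, first write rolled back")
print(accounts)
```

The compensation step is what gives the pair of writes an all-or-nothing appearance; it does nothing for isolation, because another client could still read the store between the two writes.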
myRecord = bucket.get('myData')
# Retrieve the record!
dictRecord = myRecord.data
# Now print it to see if all this actually worked.
print(dictRecord)
$ python myRiak.py
{u'Surname': u'Tsoukalos', u'Name': u'Mihalis'}
The pb_port value of 10017 is defined in
the ./dev/dev1/etc/riak.conf file using the
listener.protobuf.internal parameter. This is
the Protocol Buffers port that is used for
connecting to the Riak cluster.
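If you would rather read the port from the configuration than hard-code it, something like the following sketch could work. It assumes the setting appears on a single line of the form `listener.protobuf.internal = HOST:PORT`; the function name and the sample text are hypothetical, and a real riak.conf holds many more settings.

```python
def protobuf_port(conf_text, key="listener.protobuf.internal"):
    """Return the port number configured for `key`, or None if absent."""
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            # Assumed value format is HOST:PORT; keep the part after the last colon.
            return int(value.strip().rpartition(":")[2])
    return None


# Hypothetical excerpt standing in for ./dev/dev1/etc/riak.conf
sample = """
## Protocol Buffers listener
listener.protobuf.internal = 127.0.0.1:10017
"""
print(protobuf_port(sample))
```

On a real installation you would pass the contents of the file instead of the sample string, for example `protobuf_port(open("./dev/dev1/etc/riak.conf").read())`.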
Due to the flexibility in the way that a
NoSQL database stores data, inserting,
querying and updating data in a NoSQL
database are more complex than in a
database that uses SQL.
Generating a Riak cluster
Creating and manipulating clusters in Riak is
relatively easy with the help of the riak-admin
command. If you try to add a node that’s not
already running to a cluster, the attempt will
fail with the following error message:
$ dev/dev2/bin/riak-admin cluster join [email protected]
Node is not running!
$ ./dev/dev2/bin/riak start
$ dev/dev2/bin/riak-admin cluster join [email protected]
Success: staged join request for '[email protected]' to '[email protected]'
$ dev/dev2/bin/riak-admin cluster join [email protected]
Failed: This node is already a member of a cluster
Similarly, if you try to join a node to itself,
you will get an error message:
$ dev/dev1/bin/riak-admin cluster join [email protected]
Failed: This node cannot join itself in a cluster
The following command shows the
members of an existing cluster:
$ dev/dev2/bin/riak-admin status | grep members
ring_members : ['[email protected]','d[email protected]']
$ dev/dev1/bin/riak-admin status | grep members
ring_members : ['[email protected]','d[email protected]']
$ dev/dev3/bin/riak-admin status | grep members
Node is not running!
Another useful command that shows the
status of the nodes is the following:
$ ./dev/dev1/bin/riak-admin member-status
The joining status is temporary: it becomes
valid once all the changes that are waiting in
the queue have been applied and committed.
If you want to force the changes, you should
execute the riak-admin cluster commit
command.
If you run the riak-admin member-status
command again you will see the new status of
the dev3 node, and the riak-admin cluster
plan command displays the changes that are
about to be applied.
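The join, plan and commit steps described above could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, NODE_NAME is a placeholder for a node that already belongs to the cluster, and stopping on a failed step assumes that riak-admin exits with a non-zero status when a command fails.

```python
import subprocess


def staged_join_commands(admin, target_node):
    """Return the riak-admin invocations for a staged join, in order."""
    return [
        [admin, "cluster", "join", target_node],  # stage the join request
        [admin, "cluster", "plan"],               # review the staged changes
        [admin, "cluster", "commit"],             # apply them
    ]


def run_all(commands):
    """Run each command in turn; raise if one exits with a non-zero status."""
    for argv in commands:
        subprocess.run(argv, check=True)


# NODE_NAME is a placeholder, not a real node name.
commands = staged_join_commands("dev/dev3/bin/riak-admin", "NODE_NAME")
for argv in commands:
    print(" ".join(argv))
# run_all(commands)  # uncomment on a machine where the dev nodes are running
```

Keeping the commands as argument lists avoids shell quoting problems and makes it easy to print the plan before anything is executed.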
For a node to actually leave the cluster (see
bottom of p169 to see what an interaction with
a cluster of five nodes looks like), you must
first review the changes using the riak-admin
cluster plan command and then commit
them with riak-admin cluster commit.
So far, you won’t have seen any security
when interacting with a Riak database.
Nevertheless, Riak supports users and
passwords. You can find a lot more
information on how Riak deals with
authentication and authorisation at
http://bit.ly/RiakDocsAuthz.
What is actually stored in the /riak/LXF/test
location is what follows the -d option.
When you successfully insert a new value,
Riak will return a 204 HTTP code. As you
already know, Riak is a key-value store,
therefore in order to retrieve a value you need
to provide a key to Riak. You can connect to
the Riak dev1 server and ask for the previously
stored document by going to the
http://127.0.0.1:10018/riak/LXF/test URL.
Every URL follows the
http://SERVER:PORT/riak/BUCKET/KEY
pattern. The following command returns
the list of available buckets:
$ curl -i 'http://127.0.0.1:10018/riak?buckets=true'
HTTP/1.1 200 OK
Vary: Accept-Encoding
Server: MochiWeb/1.1 WebMachine/1.10.5 (jokes are better explained)
Date: Fri, 19 Dec 2014 21:13:37 GMT
Content-Type: application/json
Content-Length: 33
{"buckets":["LXF","linuxformat"]}
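The http://SERVER:PORT/riak/BUCKET/KEY pattern and the 204 response mentioned earlier can be sketched in Python as well. The function names are hypothetical, and the use of the PUT method and a text/plain content type are assumptions, since the original curl command is not shown on this page.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def riak_url(server, port, bucket, key):
    """Build a URL of the form http://SERVER:PORT/riak/BUCKET/KEY."""
    return "http://{}:{}/riak/{}/{}".format(
        server, port, quote(bucket, safe=""), quote(key, safe=""))


def store_value(url, body, content_type="text/plain"):
    """Send `body` (bytes) to `url`; return True if the answer was 204."""
    # Assumption: the value is stored with an HTTP PUT request.
    request = Request(url, data=body, method="PUT",
                      headers={"Content-Type": content_type})
    with urlopen(request) as response:
        return response.status == 204


print(riak_url("127.0.0.1", 10018, "LXF", "test"))
```

Percent-encoding the bucket and key keeps characters such as spaces or slashes from changing the shape of the URL. Calling `store_value()` needs a running Riak node, so it is not invoked here.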
The following command returns the list of
keys in a bucket:
$ curl 'http://127.0.0.1:10018/buckets/LXF/keys?keys=true'
{"keys":["test2","test","test3"]}
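Both listings come back as JSON, so a script can turn them into Python lists with the standard json module. The helper name below is hypothetical, and the two response bodies are the ones shown in the curl output above.

```python
import json


def parse_listing(body, field):
    """Return the list stored under `field` in a JSON listing response."""
    return json.loads(body).get(field, [])


buckets = parse_listing('{"buckets":["LXF","linuxformat"]}', "buckets")
keys = parse_listing('{"keys":["test2","test","test3"]}', "keys")
print(buckets)
print(sorted(keys))
```

The keys in the sample response are not in alphabetical order, so the sketch sorts them before printing rather than relying on the order returned.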
Most of the time, you are going to use a
script written in a programming language to
access a Riak database. The following is a
Python script that connects to a Riak
database, then stores and retrieves a document:
import riak
# Connect to the cluster
client = riak.RiakClient(pb_port=10017, protocol='pbc')
# The name of the bucket
bucket = client.bucket('python')
# "myData" is the name of the key that will be used
aRecord = bucket.new('myData', data={
    'Name': "Mihalis",
    'Surname': "Tsoukalos"
})
# Save the record
aRecord.store()
# Define the key for the record to retrieve