|
Gadfly Server Operations |
For example to startup a server for the test database "test" in directory "dbtest" (created by gftest.py) use:
python gfserve.py 2222 test dbtest adminor to start up the same server with some non-priviledged policies and some named prepared queries (as initialized in gfstest.startup(...)) use
python gfserve.py 2222 test dbtest admin gfstestIn both cases the admin password for the server is "admin" and the server runs on port 2222. See the doc string for gfserve.py for more information on the command line arguments.
Only one process should directly access a gadfly database at once (not mediated by a server), so if a server is running, no other server for that database should be started and no other process should connect in "non-server" mode to that database.
For example I imagine that simple Gadfly servers may be of use to implement database enabled CGI scripts, whereas the "non-server" Gadfly will only run with CGI scripts that do not modify the database, and the startup time for Gadfly might make those scripts unacceptibly slow if the database is large. Furthermore, by using the security features a Gadfly server could be configured to allow restricted data distribution across a network without compromising the integrity of the database.
Security is arbitrated by policies. Policies have passwords that are never transmitted in clear text. However, a "captured" command could potentially be repeated by a hostile program even without knowing the password. It is not a good idea to run admin or other unrestricted commands on a network that may have hostile parties sniffing the network. As with the rest of the system I provide no guarantees, but for many purposes the level of security provided may be acceptible. To be specific passwords are used to generate md5 certificates for all server accesses (please see gfsocket.py for implementation details).
A server always has an "admin" policy that is permitted to shutdown, restart, or force a checkpoint on the server. By default the admin policy also has the ability to run arbitrary SQL statements such as "drop table x". This ability can be disabled in a startup function if needed.
admin_policy.general_queries=0
Other policies can be created that have very restricted access. For example the following startup function initializes two policies beyond the admin policy that can only access certain tables in specific ways (from gfstest.py):
def startup(admin_policy, connection, Server_instance): """example startup script. add a policies test and test1 passwords same test1 is allowed to query the likess table by name test is allowed to update likes where drinker='nan' also add prepared query dumpwork to admin_policy. """ from gfserve import Policy admin_policy["dumpwork"] = "select * from work" test1 = Policy("test1", "test1", connection, queries=0) test = Policy("test", "test", connection, queries=0) test1["qlike"] = "select * from likes where drinker=?" test["updatenan"] = """ update likes set beer=?, perday=? where drinker='nan' """ test["getnan"] = """ select * from likes where drinker='nan' """ return {"test": test, "test1": test1}Please see the doc string for gfserve.py for more information on creating startup functions.
A policy with queries disabled (queries=0) can only execute named queries. By using such policies a DBA can configure a server such that client programs can only read certain tables, can only update certain rows of certain tables in certain ways, and so forth.
Even policies with "unrestricted access" (queries=1) can provide performance benefits if they have associated named, prepared queries (like "dumpwork" above). At the moment the SQL parser slows down gadfly a bit, and prepared queries will only be parsed once for all clients. After the first access subsequent accesses may be noticably faster (10x faster in some cases), especially if the server has the kjbuckets builtin C module. However, with queries=1 the policy can execute any SQL statement.
NOTE: The server runs all accesses (once the complete message has been read from the network) serially -- there is no concurrent access permitted to a Gadfly instance at this time. For this reason a "large query" may cause the server to "freeze" and delay other accesses. Incomplete requests due to network delays or other problems will not freeze the server, however (sockets are polled using select.select).
NOTE: All server accesses run in "autocommit mode" at this time. A successful access automatically triggers a database commit (but an unsuccessful access will rollback).
As an optimization, however, checkpoints only occur occasionally, once per a given number of accesses, configurable by setting:
Server_instance.check_loop = 100
python gfclient.py shutdown 2222 adminSee the doc strings and source for gfserve.py and gfstest.py for more information.
To access a gadfly server from a remote machine the only python modules required (in addition to the standard libraries) are gfclient.py and gfsocket.py.
Initialize a connection with a given "POLICY" with "PASSWORD" to a running server on "machine.domain.com" using port number 2222 with:
from gfclient import gfclient conn = gfclient("POLICY", 2222, "PASSWORD", "machine.domain.com")Note that policy names and passwords are case sensitive.
Queries and other statements are normally executed via cursors. Obtain a cursor from a connection using:
cursor = connection.cursor()Execute a statement in a cursor using:
cursor.execute(statement)or to provide dynamic parameters:
cursor.execute(statement, dynamic_parameters)For example
cursor.execute("select * from work") ... cursor.execute("select * from work where name=?", ("carla",))The dynamic parameters work the same as described in the the main gadfly documentation page. In particular INSERT VALUES can insert several rows at once by using a list of tuples for the rows.
If there is any problem (bad policy name, bad password, server not running, queries not allowed for this policy) the execute will generate an exception.
To run a named/prepared query (initialized at startup) use execute_prepared, which takes a prepared statement name rather than a query string:
cursor.execute_prepared("updatenan", ("rollingrock", 1)) ... cursor.execute_prepared("getnan")The execute_prepared method works just like the execute method except that the "name" must be the name of a query initialized by the startup(...) function at server startup.
NOTE: by default any execution that sends or recieves "too much
data" will be aborted. Edit gfsocket.py (both on the client end
and on the server end if different) if you wish to disable this
sanity check feature.
LEN_LIMIT=10e8
As with other dbapi cursors the results of a query can be extracted as a list of tuples using (after execute):
result_list = cursor.fetchall()The other fetches (fetchone and fetchmany) have not been implemented yet (partially since they don't make much sense in this context).
Both named and unnamed statements may be semicolon separated sequences of several SQL statements, but if they are they will return no results.
Please note: Although I have attempted to provide a robust implementation for this software I do not guarantee its correctness. I hope it will work well for you but I do not assume any legal responsibility for problems anyone may have during use of these programs.