Exporting CUCM to an Offline Local DB with Python

I use database queries in Communications Manager quite a bit to generate reports for work I'm doing. Things like finding second lines on phones or creating phone-book-style sheets for users take just a second in a query, but the downside is usually presenting that data. After switching most of my scripting from PHP to Python I've found that it handles REST well, and the xlsxwriter library is great for dumping the data into an Excel sheet already nice and formatted.

One thing I have been missing from a summary report generator script I use is the ability to look back at different configurations without needing to be in that particular system or having access to it. The Excel files are great, but I sometimes find it easier to just join a couple of tables rather than Googling the XLOOKUP syntax all the time, because no matter how many times I use it, it never seems to stick.

About 6 months ago I ran into an awesome little Python tidbit that, for me, made it worth switching from PHP and almost made me stop complaining so much about the tabs-vs-spaces indentation issue that I hate so much. SQLite3 is built in. No extra packages, no fussing with a separate Postgres install for DB work, etc. Just import it and go. It has started to change the way I think about the scripts I run and the ease of getting data into and out of a DB. A friend also put me onto Beekeeper, a DB client that lets you connect to different database types and is Linux supported. So once I have the SQLite DB I can connect to that file with Beekeeper and start querying away. Beekeeper also supports exporting query results to Excel or CSV. So my next step was figuring out how to pull the tables from CUCM into my new SQLite DB file using Python to make it all easy.

There are a couple of system tables we want to look at to make sense of the schema. First we need to identify the tables by name and table ID. Table IDs between 1 and 100 are used for system information; the systables table we are querying here is number 1. Tables over 1000 are where CUCM stores CDR data, so no real need to look there either. Everything in between is the good stuff. (Great intro article on navigating and using the CUCM Database)

admin:run sql select tabname,tabid from systables where tabid > 100 and tabid < 1000
tabname                          tabid
================================ =====
dblschemaorder                   101
aarneighborhood                  102
aardialprefixmatrix              103
availdialplan                    106

Now that we can search the systables table to get a list of all the tables and IDs, we can look at the syscolumns table to get information about the columns. In order to pull the DB into our own, the types and lengths are important. It's all cryptic, but what we want to look at are the colname, tabid, coltype, and collength columns.

admin:run sql select limit 10 * from syscolumns
colname  tabid colno coltype collength colmin colmax extended_id seclabelid colattr
======== ===== ===== ======= ========= ====== ====== =========== ========== =======
tabname  1     1     16      128       NULL   NULL   0           0          0
owner    1     2     15      32        NULL   NULL   0           0          0
partnum  1     3     2       4         NULL   NULL   0           0          0
tabid    1     4     6       4         2      2009   0           0          0
rowsize  1     5     1       2         NULL   NULL   0           0          0
ncols    1     6     1       2         NULL   NULL   0           0          0
nindexes 1     7     1       2         NULL   NULL   0           0          0
nrows    1     8     3       8         NULL   NULL   0           0          0
created  1     9     7       4         NULL   NULL   0           0          0
version  1     10    2       4         NULL   NULL   0           0          0       

Now that we have these two, a quick join helps us out to make sense of it.

admin:run sql select limit 10 tabname,colname,coltype,collength from syscolumns inner join systables on systables.tabid=syscolumns.tabid where tabname='device'
tabname colname                coltype collength
======= ====================== ======= =========
device  pkid                   256     36
device  name                   269     129
device  description            296     512
device  tkmodel                258     4
device  tkdeviceprotocol       258     4
device  tkprotocolside         258     4
device  specialloadinformation 13      32
device  fkdevicepool           0       36
device  fkphonetemplate        0       36
device  fkcallingsearchspace   0       36    

The next bit of “this is way too hard” comes courtesy of IBM. The mysterious coltype value is a reference you have to look up from here.
Now for the next bit of crazy. The table in that link clearly only goes to 53, and the types in CM are mostly above 250. The values make a little more sense in hexadecimal: the last 2 hex digits are the lookup value in that table, and the next digit up flags whether the column is NOT NULL. So the easy way is to subtract 256 if needed and then look it up again. In the case of CHAR and VARCHAR values, the collength column tells us how long we can set the value.
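As a quick sanity check, that decoding can be sketched in a few lines of Python (the sample values come from the device output above; this assumes only the NOT NULL bit sits above the base type byte):

```python
# Sketch: decode an Informix coltype into its base type plus a NOT NULL flag.
def decode_coltype(coltype):
  not_null = coltype >= 256   # the next hex digit up flags NOT NULL
  base_type = coltype % 256   # last 2 hex digits are the lookup value in IBM's table
  return base_type, not_null

print(decode_coltype(269))  # (13, True)  -> VARCHAR, NOT NULL
print(decode_coltype(0))    # (0, False)  -> CHAR, nullable
```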

Now that we have the table columns and types, we can set the primary key on either the pkid or enum field for quick parsing. Because we aren't putting this DB into production or anything like that, we won't mess with other constraints, etc. We're just looking.
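Put together, the statement we want to end up generating for the first few device columns would look roughly like this (a sketch, with the types decoded from the coltype output above; SQLite happily accepts these Informix type names):

```sql
CREATE TABLE device (
  pkid CHAR(36) PRIMARY KEY,    -- coltype 256 - 256 = 0  (CHAR), collength 36
  name VARCHAR(129),            -- coltype 269 - 256 = 13 (VARCHAR)
  description LVARCHAR(512),    -- coltype 296 - 256 = 40 (LVARCHAR)
  tkmodel INTEGER               -- coltype 258 - 256 = 2  (INTEGER)
);
```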

So now we stick it together with a little Python.

The first function is a dirty little SQL statement through AXL REST. We just use the XML interface so we don't need to mess with SOAP libraries like Zeep. This will run the query and output the results in a nice list for us to parse through. Each row comes back as a dictionary with the columns as its fields. When running large queries like “SELECT * FROM device” you will end up hitting a wall on how much AXL will send you back in one pass. Luckily, the response is nice enough to include the total number of rows and how many you can request at a time, so if we receive a 500 error we want to parse out the body and see if it is the throttling message. Then we can go back and loop through the table until we get it all.

A few imports are needed up front (args comes from argparse elsewhere in the script):

# imports used by the functions below
import re
import requests
import xmltodict
from requests.auth import HTTPBasicAuth

## Function to get SQL response and return array formatted data
def axl_query(sqlStatement):
  #initialize variables
  resultDict = []
  rowInfo = ["",None,None]
  execSQLStatement = sqlStatement

  # loop through the query until there is no more to get
  while rowInfo[1] is None or int(rowInfo[1]) > len(resultDict):
    # if the first pass sees the need to loop through then set the max for the query and the page number as part of the new sql query
    if rowInfo[1] is not None and int(rowInfo[1]) > 0:
      execSQLStatement = re.sub("[sS][eE][lL][eE][cC][tT] ",f"SELECT SKIP {len(resultDict)} LIMIT {rowInfo[2]} ",sqlStatement)

    #generate the XML payload (the ns URL assumes AXL schema version 11.5; match it to your cluster)
    payload = f"""<?xml version="1.0" encoding="UTF-8"?>
            <soapenv:Envelope xmlns:soapenv="" xmlns:ns="">
              <soapenv:Header/>
              <soapenv:Body>
                <ns:executeSQLQuery>
                  <sql>{execSQLStatement}</sql>
                </ns:executeSQLQuery>
              </soapenv:Body>
            </soapenv:Envelope>"""

    #set REST headers
    headers = {
      'Content-Type': 'text/xml'
    }

    #send REST query and store as response
    response = requests.request("POST", f'https://{args.ip}:8443/axl/', auth = HTTPBasicAuth(args.u,args.p), headers=headers, data = payload, verify=False)

    #If request was good then return response values
    if response.status_code == 200:
      returnData = xmltodict.parse(response.text, dict_constructor=dict)['soapenv:Envelope']['soapenv:Body']['ns:executeSQLQueryResponse']['return']

      #normalize output to always show array of rows
      if returnData is not None:
        if type(returnData['row']) is not list:
          resultDict.extend([returnData['row']])
        else:
          resultDict.extend(returnData['row'])

      # if the info is all good break out of the loop.  No paging required
      if rowInfo[1] is None or int(rowInfo[1]) <= len(resultDict):
        break

    # match on throttled response
    elif"Query request too large. Total rows matched: ",response.text):
      rowInfo = re.sub(r"[^\d:]",'',xmltodict.parse(response.text, dict_constructor=dict)['soapenv:Envelope']['soapenv:Body']['soapenv:Fault']['faultstring']).split(":")
      print(f"Multiple queries needed for {rowInfo[1]} rows.")

    #if error leave
    else:
      print(f'Requests Error: {response}')
      resultDict = False
      break

  return resultDict

Our first pass at creating the table is based on getting the table format so we can build our SQLite DB. Here I just used a dictionary to store our column types. Since we don't need everything as a 1:1 mapping, you'll see some values that are simplified. We also want to subtract 256 where needed and set pkid or enum as the primary key when possible.

#get table columns and types from CM/INFORMIX and create the sqlite3 DB
def create_table(tableName):
  # map column types
  dbColTypeMap = {'0'  :'CHAR',
                '1'  :'SMALLINT',
                '2'  :'INTEGER',
                '3'  :'FLOAT',
                '4'  :'SMALLFLOAT',
                '5'  :'DECIMAL',
                '6'  :'INTEGER', #Serial 1
                '7'  :'DATE',
                '8'  :'MONEY',
                '9'  :'NULL',
                '10' :'DATETIME',
                '11' :'BYTE',
                '12' :'TEXT',
                '13' :'VARCHAR',
                '14' :'INTERVAL',
                '15' :'NCHAR',
                '16' :'NVARCHAR',
                '17' :'INT',
                '18' :'SERIAL',
                '19' :'SET',
                '20' :'MULTISET',
                '21' :'LIST',
                '22' :'ROW',
                '23' :'COLLECTION',
                '40' :'LVARCHAR',
                '41' :'BOOLEAN',
                '43' :'LVARCHAR',
                '45' :'BOOLEAN',
                '52' :'BIGINT',
                '53' :'BIGSERIAL'
                }

  # query to get columns and types
  deviceTable = axl_query(f"SELECT colname,coltype,collength FROM syscolumns  INNER JOIN systables ON systables.tabid=syscolumns.tabid  WHERE systables.tabname = '{tableName}'")

  #start building the SQL string to create the new table
  print(f"CREATE TABLE {tableName}")
  sqlCreateTable = f"CREATE TABLE {tableName} ("

  #loop through columns
  for col in deviceTable:
    #correct "NOT NULL" column types
    if int(col['coltype']) >= 256:
      col['coltype'] = str(int(col['coltype']) -256 )
    #if needed add the length on the columns (CHAR/VARCHAR)
    if col['coltype'] in ['0','13','15','16','40','43']:
      sqlCreateTable = sqlCreateTable + f"{col['colname']} {dbColTypeMap[col['coltype']]}({col['collength']})"
    else:
      sqlCreateTable = sqlCreateTable + f"{col['colname']} {dbColTypeMap[col['coltype']]}"
    #set the primary key if possible
    if col['colname'] in ['pkid','enum']:
      sqlCreateTable = sqlCreateTable + " PRIMARY KEY"
    sqlCreateTable = sqlCreateTable + ", "

  sqlCreateTable = sqlCreateTable[:-2] + ")"

  #create the new table in the DB (dbConn: sqlite3 connection assumed opened earlier in the script)
  try:
    dbConn.execute(sqlCreateTable)
  except Exception as err:
    print(f"Error creating table: {err}")
    return False
  return True

Now we have a built DB and we can load the data to it. We’ll use the same sql function, but this time loop through the results to insert.

# Load table in DB with values from dict generated from SELECT * on the table
def load_table(tableName):
  #get all info from the table
  try:
    tableDict = axl_query(f"SELECT * FROM {tableName}")
  except Exception as err:
    print(f"Error loading table: {err}")
    return False

  #loop through each row in the results and build a SQL string to insert
  for row in tableDict:
    rowValues = "("
    firstLoop = True
    for item in row.values():
      #strip single quotes so they don't break the INSERT string
      item = str(item).replace("'", '')
      if firstLoop:
        rowValues = f"{rowValues}'{item}'"
      else:
        rowValues = f"{rowValues},'{item}'"
      firstLoop = False
    rowValues = rowValues + ")"

    sql = f"INSERT INTO {tableName} ({','.join(tableDict[0].keys())}) VALUES {rowValues};"

    #run the insert (dbConn: sqlite3 connection assumed opened earlier in the script)
    dbConn.execute(sql)

  dbConn.commit()
  return True

Now at the end we have a new DB file. We can connect to the file through Beekeeper to look around, or open it back up with Python to parse out data.