Monday, July 29, 2013

DBIR 2013 – The Breaches Keep Coming

Verizon has recently published their 2013 Data Breach Investigation Report (DBIR) covering incidents that occurred in 2012. We’ve all seen the headlines that show all too clearly that security breaches continue:


  • 22 million logons stolen from Yahoo Japan
  • $45 million stolen in complex ATM heist from two middle eastern banks
  • And just this week, $300 million in losses from the theft of 160 million payment cards in extremely well organised, multi-year fraud

  • The Verizon DBIR underscores those headlines with a lot of hard data, gathered from 47,000 security incidents during 2012. Over the next few weeks we’re going to take a good look at the DBIR, and see what sort of conclusions we can draw from it that apply to NonStop users – what applications and data are at risk, from what sorts of attacks, and what can be done to protect those valuable assets.

    Here are some facts from the 2013 DBIR to get you started:


  • 621 confirmed data breaches studied in detail
  • 19 contributors, including government agencies, private security organizations and consulting companies
  • 44 million records compromised
  • The largest and most comprehensive data breach study performed each year
  • 75% of attacks were opportunistic – not targeted at a specific individual or company – with the majority of those financially motivated
  • 37% of breaches affected financial institutions

  • The fully report is available here:
    http://www.verizonenterprise.com/DBIR/2013/

    In the next blog we’ll take a look at the trends that become clear from this data, and what you can learn for your organization to be best prepared to defend against these attacks. In the third instalment we’ll look at some NonStop-specific recommendations that can help in your shop, and finally we’ll wrap up with some thoughts on XYPRO products and services that are relevant to the study.

    What do you think – have you read the DBIR? How relevant is it to your organization and your role? Let us know by emailing me atandrew.price@xypro.com.

    Monday, July 22, 2013

    XYPRO NonStop Security Fundamentals Top 10 List – #10

    Because high-availability and fault-tolerant systems need strong security

    Does it make sense to have high-availability and fault-tolerance without strong security? We at XYPRO don’t think so. We recognize that companies run their most important business applications and processes on the NonStop server platform and keeping those assets safe from data loss, tampering and inadvertent harm is mission critical.

    XYPRO has been providing NonStop security solutions for over 30 years—we’ve literally written the books on NonStop security—and we’ve assembled an informal “Top 10” list of NonStop security fundamentals. Over the next couple months, we’ll count down our list of Top 10 NonStop security fundamentals—your discussion, feedback and debate are welcome. Here’s #10 on our list.

    #10: Secure the default system access settings

    To facilitate initial configuration and set-up, HP NonStop servers come with a number of default security settings. To have a well-protected NonStop system many of these default settings need to be addressed.

    Protect or Delete NULL.NULL. NonStop servers are shipped with the default userid NULL.NULL (0,0). NULL.NULL is an out-of-the-box userid that is not password protected and gives non-privileged system access. With unprotected NULL.NULL, there is a risk that unauthorized users will be able to gain access to the system and explore system settings, users and files and potentially discover and exploit system vulnerabilities. To protect the system, the NULL.NULL userid should be deleted or, if that’s not possible, the risk should be mitigated by renaming the 0,0 userid to something other than “NULL.NULL”, assigning a strong password, and expiring or “freezing” the 0,0 userid so that it can’t be used to logon to the system.

    Remove compilers from production systems. Compilers are dangerous because code can be inserted or deleted to circumvent previously implemented controls. Additionally, language compilers might be used to develop test or hacking programs to access sensitive data. To protect applications from inadvertent or malicious changes or outages, compilers and related utilities should be removed or very tightly locked down on secure systems.

    Configure Safeguard auditing in order to meet PCI requirements.The Payment Card Industry Data Security Standard (PCI DSS) is an important industry security standard developed to protect sensitive cardholder data and a key requirement for PCI DSS compliance is to “track and monitor all access to network resources and cardholder data”. Within NonStop, the Safeguard utility on NonStop provides the capability to monitor and audit security-related events. While some Safeguard events are always audited, most need to be configured to enable auditing. Properly configuring Safeguard to audit all PCI DSS-related security events is an important step in setting up a new NonStop system (or in ensuring PCI compliance for an existing system).

    Add and configure Safeguard security groups. There are six valid Safeguard security groups but they do not exist on the shipped system and must be added. Using these security groups, specific users can be delegated the authority to execute certain restricted Safeguard commands. Until these groups are created, the restricted commands can be executed by any SUPER group member.

    Add and configure Safeguard OBJECTTYPE records. Safeguard uses OBJECTTYPEs to control who can create protection records for a particular type of object or device. Without OBJECTTYPE records, any local member of the SUPER group can add a protection record for an object or device name and thereby gain control of that object or device. To protect objects and reduce possibility of misuse, add all the necessary OBJECTTYPEs and assign these to a non-super group security administrator.

    Secure sensitive objects. As shipped, there are several sensitive objects in Guardian that must be protected: TANDUMP, DIVER, USERID, and USERIDAK. Each of these objects has power capabilities within Guardian and should be secured to have SUPER only access.

    To follow along with the rest of this blog series on the NonStop Security Fundamentals Top 10 List go to blog.xypro.com.

    More in-depth information and guidance on these security subjects are available in XYPRO’s NonStop security handbooks: HP NonStop Server Security: A Practical Handbook and Securing HP NonStop Servers in an Open Systems World: TCP/IP, OSS and SQL.

    You may also contact XYPRO for assistance. For over 30 years, XYPRO has provided NonStop security solutions and services that help companies protect their NonStop systems and comply with industry regulations (such as PCI DSS, HIPAA, and SOX).

    Friday, July 19, 2013

    Does my NonStop database need maintenance?

    The short answer is yes.

    When was the last time you went to your local library? Did you see people working there, cataloging and shelving books? Where there others dropping off and checking out materials? How many items do you think your library branch carries? 10,000? 100,000? More? Consider the database on your NonStop server and think of how many entries are stored within. Numbers like 100,000 rows are where much of our data starts and our customers are a lot more aggressive and much more precise than even the most critical librarian. Your NonStop database dwarves most libraries and as such requires much more detailed attention than any library.

    As any library grows, so does the complexity of the items it stores. If you have a collection of books at home, it is likely that you don’t really need any sort of structure to manage them. You could put them in a box or a room and be able to find any single item with relative ease simply by searching the entire collection when you need something. If your collection is any larger, you need a way to find anything. Libraries do this by assigning each item a number and then organizing the material by that number. In this way, as the collection changes size, the elements within it continue to be located in relatively the same logical location. If the number of items increases, the library adds shelves and they insert new items between the existing ones. When things get checked out, removed or changed, the library staff manually reorganizes the existing material on an ongoing basis.

    All NonStop database data is store in structured files possibly distributed across many disks on many systems in many locations. This allows data access to be fast and efficient. This data is stored in tables (i.e. libraries) and organized into rows and columns (i.e. shelves) for easy access.
    Now consider the manual maintenance that the library staff performs on a daily basis. Each item coming into or being checked out of the library is a “transaction”. An item checked out leaves a space and one checked in needs space. The space is in constant motion and may need to grow or shrink rapidly for larger events like adding a new collection or removing all books that are over a certain age.

    An active database processes data in almost the exact same way. Records are added, updated and remove much like books in the library, but much more often and at a much greater rate. Consider what your local library would look like if the staff took a month off and no one was expected to manage the materials. People could simply throw things in where they think they should be. Or all books could be added to the last shelf. What about adding new books? What if the shelves are overloaded? The RDBMS on the system is the librarian of your data and it does a great job, but it isn’t anywhere near as smart as a human nor does it have the time to make management decisions based on things like current or future transaction rates, etc.

    The job of the RDBMS (Relational DataBase Management System) is to put the data where it belongs. And they do this very well. If your database never changes, or only appends data to the end, this isn’t an issue. Most databases are updated at random locations at distributed times. Disk space is a lot like library space. It is fixed and the material you have has to fit within it and you have to be able to find what you need at any given moment. Due to the rigid rules enforced by the RDBMS, your data will always be in order, but it may be a little hard to find.

    Consider the book just returned to the library. The system shows it has been returned, but it isn’t in the proper location on the shelf. The librarian knows the book is in the library, but has to take time to locate it. The RDBMS works in a very similar way. When a new record is inserted and there isn’t a place for it where it belongs, a link is made to the location of the data on disk. When you want to retrieve it, the system looks where the data should be and finds that it must go elsewhere to find it. Over time, that same piece of data may have been updated yet again and a link is made to the location of the NEW data. Again, this may be happening very quickly to many records. Over time, even today’s fast systems can slow down spending more and more time looking for data that isn’t well organized.

    The library staff spends a lot of time maintaining the order and space of their materials with intelligence. The RDBMS doesn’t have this luxury and must be instructed to go through and clean itself up. For most platforms the only way to accomplish such a task is to close the library, take all the materials out and put them back in order. This is a monumental undertaking and can take a very long time. Thanks to the wise developers of the Tandem corporation back in the 1970’s and 1980’s, we have enjoyed the ability to perform maintenance on our library (tables) without suffering the outage that keeps most RDBMS’s from calling themselves NonStop.

    Reorganizing the data in the database is based on how much data is stored and how often it is changed. For a large database with frequent updates, this can be a constant, ongoing process that may never complete. For most, it is an operation that takes place during off peak hours and keeps the database in an efficient, performing state. For every database, it is a necessity.

    The issue gets a little more involved on the NonStop as well because the architecture promotes breaking large database objects into smaller parts for performance. On the NonStop, this has been known as partitioning since the late 1980’s. Kids today refer to this by a new, fancy name: MapReduce. On other platforms, this may introduce massive complexity, but for us each database partition is simply another object that requires attention. NonStop SQL (both MX and MP) treat partitioned objects as singular logical entities for query purposes, but from a maintenance point of view, each partition is a standalone object that requires individual attention. A larger table may have a few partitions or hundreds. Each partition may contain different amounts of data and may require maintenance on a different schedule.
    Now that you know that you may have a lot of disparate database objects that require constant attention, don’t forget that the original NonStop record manager (Enscribe) also suffers from the exact same maintenance requirements, but usually on a smaller scale.

    Let’s get technical. It’s time for some definitions.
    If you plan to read further, we should probably define some terms used to refer to NonStop disk objects. The comment in () at the end is the library equivalent of the definition.

    NonStop disk objects:

    For the purposes of this article, a NonStop disk object is any key sequenced Enscribe file or SQL table or index.

    DP2 (or the disk process):

    DP2 is the disk process that reads and writes data to disk. DP2 understands the structured format of the data on disk. On other platforms, it may be known as a driver. (The librarian)

    Block (from the Enscribe Programmers Guide):

    A block is the unit of information transferred between the disk process and the disk. A block consists of one or more logical records and associated control information. A record cannot span block boundaries (that is, it cannot begin in one block and end in another). The block size of a key-sequenced file should be large in relation to the record size, and especially so in relation to the key size, to reduce the number of block splits as records are inserted into the file. Furthermore, a larger data block implies more data records per block and therefore fewer index records and fewer index blocks. (Book shelf)

    Index block:

    Index blocks tell the disk process where specific data can be found. (The call numbers at the end of a book shelf)

    Extent (from the Enscribe Programmers Guide):

    When you create a NonStop disk object, you can specify the maximum amount of physical disk space to be allocated for that object. Physical space is allocated in the form of extents. An extent is a contiguous block of disk space that can range in size from a single page (2048 bytes) to 65,535 pages (134,215,680 bytes) for format 1 files or to 536,870,912 pages for format 2 files. (The size of a shelf, or the number of books it can hold, secondary extents are additional shelves added when the current shelves are full)

    Table (from the NonStop SQL/MX Glossary):

    A logical representation of data in a database in which a set of records is represented as a sequence of rows, and the set of fields common to all the records is represented as a series of columns. The intersection of a row and column represents the data value of a particular field in a particular record. As a database object, a table defines data in columns and defines the physical characteristics of the table. (Sections. I.e. non-fiction, periodicals, etc)

    Primary key (from the NonStop SQL/MX Glossary):

    A column or set of columns that define the uniqueness constraint for a table. (The Dewey Decimal value of the book in question. How you locate an item)

    Index (from the NonStop SQL/MX Glossary):

    An alternate access path (alternate key) to a table that differs from the primary access path (clustering key) defined for the table at creation time. An index, stored in a key-sequenced file, includes columns for the clustering key and the alternate key. (Same books, but in a different order. Instead of by Dewey Decimal number, they are ordered by size and shape, or by author only)

    Slack:

    The amount of free space between records in a key sequenced table/index.

    Block Split (from the Enscribe Programmers Guide):

    The position of a new record inserted into a key-sequenced file is determined by the value of its primary-key field. If the block where a new record is to be inserted into a file is full, a block split occurs. This means that the disk process allocates a new data block, moves part of the data from the old block into the new block, and gives the index block a pointer to the new data block.

    Fragmentation (from Wikipedia):

    A phenomenon in which storage space is used inefficiently, reducing capacity and often performance. Fragmentation leads to storage space being “wasted”, and the term also refers to the wasted space itself.

    Defragmentation:

    The actions operations personnel take to remove any wasted space and make disk storage and access contiguous and well-ordered within individual NonStop disk objects. Commonly referred to as defrag(ing), reload(ing), or reorg(ing).

    How does a reorg work? What does it do?
    There are three basic types of fragmentation that have an impact on database performance and need to be addressed: disorganized data chains, poor space utilization and over allocated extents.

    Disorganized data is a major cause of database performance degradation and it is often overlooked. Disorganization can occur any time a database is updated. Inserts, delete and updates can force the database to move data around due to changes in the physical length of records. The system does what it can with what it has, but sooner or later the need will arise for a record to be in between two others where there simply is not space. On the NonStop, the data will be written to a location with space and the address of the new location will be put in between the two other records so the system can find it when necessary. When this occurs, the disk process has to go where the data should be only to find out it has to go elsewhere to get it. In the worst case, the system will perform a block split where the disk process has to physically move data around to make room for more data. Any one of these operations on its own doesn’t appear to be much of an issue, but think back to the librarian. What if each time you wanted to check out a book, someone else had to find what you want for you in a big pile of disorganized books?

    Poor space utilization comes into play when the space between records is inefficiently used. Most commonly caused by deletes, but this may occur during update as well. If you start off with a database of 10,000 records and delete the first 9,999 of them without maintenance, the database may still look ( from the outside) like it contains all 10,000 records.

    The over allocated extent scenario occurs a lot as well, but is also less visible to most. In this case, over time the database has required more and more space. To obtain this space, the disk process will grab more disk space (a secondary extent) to store data. Since this data may not be contiguous to the original data, there is overhead incurred by having to locate data in the secondary extents when a search is done.

    Reorganizing the database simply does exactly what it states: it re-organizes the data in a database. As outlined above, the data in a database can, and will, get disorganized. When you reorganize the database, you ask the RDBMS (or the disk process) to examine all the data in a given object and put it back in order. To accomplish this on the Nonstop, the system has to read all the data, in order, and then put it back into the same container while allowing updates at the same time! This may not appear that difficult, but it is. The disk process does all the work for you. The process is intelligent enough to use space already allocated to the object to store the data while in flight to keep from having to allocate more space. The blocks of data are read in logical order and written back to the disk in physical order. This ensures the fastest access to the data. If there is empty space in the existing primary extent for the object, the space is re-used. If this means that data can be moved from secondary extents back into the primary extent, then the secondary (possibly non-contiguous) extent could be emptied and released.

    The best performance comes from a database where the records are in order and enough space exists between records (slack) for growth. Every object has its own optimal values and all require constant monitoring and tuning.

    How do I know if my data requires reorganization?
    The longer your database exists, the more disorganized it gets. By adding and removing records, changing the data in variable length fields, and performing other routine tasks you steadily degrade the physical layout of data. The file utility program can give you some idea by simply reporting the amount of slack (or free) space in an object or partition. This information is useful, but not deterministic. A better way is to walk the data chains using a tool designed for this purpose. In this way, you can tell not only how much space is left in the object, but how many data blocks are in order (or are “chained”), how much data in each block, how organized (or disorganized) the index blocks are and more. Good tools will allow you to examine single partitions of an object, or the object as a whole. Keeping in mind that database reloads are done on single partitions at a time. Better tools do all this and they do it fast by examining samples of the database as opposed to reading every single bit.

    What can be done?
    The simple fact is: if you have a database, it needs attention. The greater number of parts it consists of, the more attention it needs. Not all databases are large and not all large databases get a lot of fragmentation, but every database requires maintenance over time. If you don’t know if your database needs maintenance, than you are already behind and likely suffering from performance degradation. Find a tool to help you identify and manage your database maintenance and sleep just that much better at night knowing your database is not only NonStop, but performing at its peak.