Oakies Blog Aggregator

Oracle Shared Pool Internals: List Chunk Position in the LRU List

There was a discussion at Oracle-L about shared pool chunk eviction and I thought I’d show how to see a shared pool memory chunk’s position in the LRU list yourself. This is a post mostly about Oracle internals and less about practical every day troubleshooting.
There’s a view X$KGHLU (@kghlu.sql script) that shows you a summary of shared pool (sub)heap allocations and how many of these chunks are in the “transient list” (used once) vs.

Oracle Imp (import) to RDS

I had an old export dump file that I wanted to import into RDS.

I first tried to use S3 as a hosting site for the .dmp file and use rdsadmin.rdsadmin_s3_tasks.download_from_s3 to copy the file to the RDS database then import it with DBMS_DATAPUMP.

Unfortunately, DBMS_DATAPUMP only works with expdp files, so instead of using DBMS_DATAPUMP I got the old “imp” binary and used it with TWO_TASK pointing at the RDS database I wanted the data loaded into.

For the record, here are the steps I took.

 

First, put the file into an S3 bucket; mine is called kylelf-datadump.

Create an IAM policy to allow Amazon RDS access to an Amazon S3 bucket

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/oracle-s3-integra...

My policy looks like this:

{
 "Version": "2012-10-17",
 "Statement": [
 {
 "Sid": "VisualEditor0",
 "Effect": "Allow",
 "Action": [
 "s3:PutObject",
 "s3:GetObject",
 "s3:ListBucket"
 ],
 "Resource": "arn:aws:s3:::kylelf-datadump"
 }
 ]
}
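One caveat worth noting (my addition, not part of the original steps): s3:GetObject and s3:PutObject act on objects rather than on the bucket itself, so depending on your setup the Resource list may also need the object-level ARN, along the lines of:

 "Resource": [
     "arn:aws:s3:::kylelf-datadump",
     "arn:aws:s3:::kylelf-datadump/*"
 ]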

Create an IAM role to allow Amazon RDS access to an Amazon S3 bucket

again, see: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/oracle-s3-integra...

Add the role to the database you want to import into (via the console steps below, or the CLI sketch that follows):

  1. Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/.
  2. Choose the Oracle DB instance name to display its details.
  3. On the Connectivity & security tab, in the Manage IAM roles section, choose the role to add under Add IAM roles to this instance.
  4. For Feature, choose S3_INTEGRATION.
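If you prefer to script the role setup instead of using the console, a rough AWS CLI sketch could look like the following. The role name, instance identifier, and account number are hypothetical, and trust.json must allow the rds.amazonaws.com service to assume the role:

# Hypothetical names; adjust to your environment
aws iam create-role --role-name rds-s3-import-role \
    --assume-role-policy-document file://trust.json
aws iam put-role-policy --role-name rds-s3-import-role \
    --policy-name s3-dump-access --policy-document file://policy.json
# Attach the role to the RDS instance with the S3_INTEGRATION feature
aws rds add-role-to-db-instance --db-instance-identifier ora19 \
    --feature-name S3_INTEGRATION \
    --role-arn arn:aws:iam::123456789012:role/rds-s3-import-role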

Create an S3 option group

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithOptionGroups.html#USER_WorkingWithOptionGroups.Create

  1. Create an option group.
  2. Add S3 integration to the option group.
  3. Modify the instance, changing its option group to the new S3 option group (a CLI sketch of these steps follows below).
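Here is a hedged AWS CLI sketch of those three steps; the option group name and instance identifier are hypothetical, and the engine name/version must match your instance:

# Hypothetical names; adjust engine name and version to your instance
aws rds create-option-group --option-group-name oracle-s3-integration \
    --engine-name oracle-ee --major-engine-version 19 \
    --option-group-description "S3 integration for dump files"
aws rds add-option-to-option-group --option-group-name oracle-s3-integration \
    --options OptionName=S3_INTEGRATION --apply-immediately
aws rds modify-db-instance --db-instance-identifier ora19 \
    --option-group-name oracle-s3-integration --apply-immediately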

Then download the dump file from S3.

SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3(
 p_bucket_name => 'kylelf-datadump', 
 p_directory_name => 'DATA_PUMP_DIR') 
 AS TASK_ID FROM DUAL; 

TASK_ID

——————————————————————————–

1583185124793-43

Check the status of the download:

SELECT text FROM
table(rdsadmin.rds_file_util.read_text_file('BDUMP','dbtask-1583185124793-43.log'));

Check that the file is there:

select * from table(RDSADMIN.RDS_FILE_UTIL.LISTDIR('DATA_PUMP_DIR')) order by mtime;
FILENAME   TYPE         FILESIZE MTIME
---------- ---------- ---------- ---------
TC1.dmp    file         39829504 02-MAR-20

My file is the TC1.dmp file

Now import it

DECLARE
hdnl NUMBER;
BEGIN
hdnl := DBMS_DATAPUMP.OPEN( operation => 'IMPORT', job_mode => 'SCHEMA', job_name=>null);
DBMS_DATAPUMP.ADD_FILE( handle => hdnl, filename => 'TC1.dmp', directory => 'DATA_PUMP_DIR', filetype => dbms_datapump.ku$_file_type_dump_file);
DBMS_DATAPUMP.METADATA_FILTER(hdnl,'SCHEMA_EXPR','IN (''KYLELF'')');
DBMS_DATAPUMP.START_JOB(hdnl);
END;
/

This gave me the error

ORA-39001: invalid argument value
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.DBMS_DATAPUMP", line 3507
ORA-06512: at "SYS.DBMS_DATAPUMP", line 3756
ORA-06512: at line 5

Not very informative.

I found an example on Stack Overflow showing how to get more verbose error messages:

https://stackoverflow.com/questions/45348688/oracle-datapump-export-error-invalid-argument-value

SET VERIFY OFF
SET FEEDBACK OFF
SET SERVEROUTPUT ON

DECLARE
  ind        NUMBER;
  fileType   NUMBER;
  value      VARCHAR2(2048);
  infoTab    KU$_DUMPFILE_INFO := KU$_DUMPFILE_INFO();
 
BEGIN
  --
  -- Get the information about the dump file into the infoTab.
  --
  BEGIN
    DBMS_DATAPUMP.GET_DUMPFILE_INFO('TC1.dmp','DATA_PUMP_DIR',infoTab,fileType);
    DBMS_OUTPUT.PUT_LINE('---------------------------------------------');
    DBMS_OUTPUT.PUT_LINE('Information for file: TC1.dmp');
 
    --
    -- Determine what type of file is being looked at.
    --
    CASE fileType
      WHEN 1 THEN
        DBMS_OUTPUT.PUT_LINE('TC1.dmp is a Data Pump dump file');
      WHEN 2 THEN
        DBMS_OUTPUT.PUT_LINE('TC1.dmp is an Original Export dump file');
      WHEN 3 THEN
        DBMS_OUTPUT.PUT_LINE('TC1.dmp is an External Table dump file');
      ELSE
        DBMS_OUTPUT.PUT_LINE('TC1.dmp is not a dump file');
        DBMS_OUTPUT.PUT_LINE('---------------------------------------------');
    END CASE;
 
  EXCEPTION
    WHEN OTHERS THEN
      DBMS_OUTPUT.PUT_LINE('---------------------------------------------');
      DBMS_OUTPUT.PUT_LINE('Error retrieving information for file: ' ||
                           'TC1.dmp');
      DBMS_OUTPUT.PUT_LINE(SQLERRM);
      DBMS_OUTPUT.PUT_LINE('---------------------------------------------');
      fileType := 0;
  END;
 
  --
  -- If a valid file type was returned, then loop through the infoTab and 
  -- display each item code and value returned.
  --
  IF fileType > 0
  THEN
    DBMS_OUTPUT.PUT_LINE('The information table has ' || 
                          TO_CHAR(infoTab.COUNT) || ' entries');
    DBMS_OUTPUT.PUT_LINE('---------------------------------------------');
 
    ind := infoTab.FIRST;
    WHILE ind IS NOT NULL
    LOOP
      --
      -- The following item codes return boolean values in the form
      -- of a '1' or a '0'. Display them as 'Yes' or 'No'.
      --
      value := NVL(infoTab(ind).value, 'NULL');
      IF infoTab(ind).item_code IN
         (DBMS_DATAPUMP.KU$_DFHDR_MASTER_PRESENT,
          DBMS_DATAPUMP.KU$_DFHDR_DIRPATH,
          DBMS_DATAPUMP.KU$_DFHDR_METADATA_COMPRESSED,
          DBMS_DATAPUMP.KU$_DFHDR_DATA_COMPRESSED,
          DBMS_DATAPUMP.KU$_DFHDR_METADATA_ENCRYPTED,
          DBMS_DATAPUMP.KU$_DFHDR_DATA_ENCRYPTED,
          DBMS_DATAPUMP.KU$_DFHDR_COLUMNS_ENCRYPTED)
      THEN
        CASE value
          WHEN '1' THEN value := 'Yes';
          WHEN '0' THEN value := 'No';
        END CASE;
      END IF;
 
      --
      -- Display each item code with an appropriate name followed by
      -- its value.
      --
      CASE infoTab(ind).item_code
        --
        -- The following item codes have been available since Oracle
        -- Database 10g, Release 10.2.
        --
        WHEN DBMS_DATAPUMP.KU$_DFHDR_FILE_VERSION   THEN
          DBMS_OUTPUT.PUT_LINE('Dump File Version:         ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_MASTER_PRESENT THEN
          DBMS_OUTPUT.PUT_LINE('Master Table Present:      ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_GUID THEN
          DBMS_OUTPUT.PUT_LINE('Job Guid:                  ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_FILE_NUMBER THEN
          DBMS_OUTPUT.PUT_LINE('Dump File Number:          ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_CHARSET_ID  THEN
          DBMS_OUTPUT.PUT_LINE('Character Set ID:          ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_CREATION_DATE THEN
          DBMS_OUTPUT.PUT_LINE('Creation Date:             ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_FLAGS THEN
          DBMS_OUTPUT.PUT_LINE('Internal Dump Flags:       ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_JOB_NAME THEN
          DBMS_OUTPUT.PUT_LINE('Job Name:                  ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_PLATFORM THEN
          DBMS_OUTPUT.PUT_LINE('Platform Name:             ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_INSTANCE THEN
          DBMS_OUTPUT.PUT_LINE('Instance Name:             ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_LANGUAGE THEN
          DBMS_OUTPUT.PUT_LINE('Language Name:             ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_BLOCKSIZE THEN
          DBMS_OUTPUT.PUT_LINE('Dump File Block Size:      ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_DIRPATH THEN
          DBMS_OUTPUT.PUT_LINE('Direct Path Mode:          ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_METADATA_COMPRESSED THEN
          DBMS_OUTPUT.PUT_LINE('Metadata Compressed:       ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_DB_VERSION THEN
          DBMS_OUTPUT.PUT_LINE('Database Version:          ' || value);
 
        --
        -- The following item codes were introduced in Oracle Database 11g
        -- Release 11.1
        --

        WHEN DBMS_DATAPUMP.KU$_DFHDR_MASTER_PIECE_COUNT THEN
          DBMS_OUTPUT.PUT_LINE('Master Table Piece Count:  ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_MASTER_PIECE_NUMBER THEN
          DBMS_OUTPUT.PUT_LINE('Master Table Piece Number: ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_DATA_COMPRESSED THEN
          DBMS_OUTPUT.PUT_LINE('Table Data Compressed:     ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_METADATA_ENCRYPTED THEN
          DBMS_OUTPUT.PUT_LINE('Metadata Encrypted:        ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_DATA_ENCRYPTED THEN
          DBMS_OUTPUT.PUT_LINE('Table Data Encrypted:      ' || value);
        WHEN DBMS_DATAPUMP.KU$_DFHDR_COLUMNS_ENCRYPTED THEN
          DBMS_OUTPUT.PUT_LINE('TDE Columns Encrypted:     ' || value);
 
        --
        -- For the DBMS_DATAPUMP.KU$_DFHDR_ENCRYPTION_MODE item code a
        -- numeric value is returned. So examine that numeric value
        -- and display an appropriate name value for it.
        --
        WHEN DBMS_DATAPUMP.KU$_DFHDR_ENCRYPTION_MODE THEN
          CASE TO_NUMBER(value)
            WHEN DBMS_DATAPUMP.KU$_DFHDR_ENCMODE_NONE THEN
              DBMS_OUTPUT.PUT_LINE('Encryption Mode:           None');
            WHEN DBMS_DATAPUMP.KU$_DFHDR_ENCMODE_PASSWORD THEN
              DBMS_OUTPUT.PUT_LINE('Encryption Mode:           Password');
            WHEN DBMS_DATAPUMP.KU$_DFHDR_ENCMODE_DUAL THEN
              DBMS_OUTPUT.PUT_LINE('Encryption Mode:           Dual');
            WHEN DBMS_DATAPUMP.KU$_DFHDR_ENCMODE_TRANS THEN
              DBMS_OUTPUT.PUT_LINE('Encryption Mode:           Transparent');
          END CASE;
 
        --
        -- The following item codes were introduced in Oracle Database 12c
        -- Release 12.1
        --
 
        --
        -- For the DBMS_DATAPUMP.KU$_DFHDR_COMPRESSION_ALG item code a
        -- numeric value is returned. So examine that numeric value and
        -- display an appropriate name value for it.
        --
        
      END CASE;
      ind := infoTab.NEXT(ind);
    END LOOP;
  END IF;
END;
/

Now I can see the error better:

39001: ORA-39001: invalid argument value
-39000: ORA-39000: bad dump file specification
-39143: ORA-39143: dump file "/rdsdbdata/datapump/TC1.dmp" may be an original
export dump file

 

Looking this up via Google indicates that this is an old “exp” dump file, which is not compatible with “impdp”.

Thus I had to use the old imp tool.

From OTN:

https://www.oracle.com/database/technologies/instant-client/linux-x86-64...

I got

oracle-instantclient19.6-basic-19.6.0.0.0-1.x86_64.rpm
oracle-instantclient19.6-sqlplus-19.6.0.0.0-1.x86_64.rpm
oracle-instantclient19.6-tools-19.6.0.0.0-1.x86_64.rpm

I installed these on an EC2 instance:

sudo rpm -ivh oracle-instantclient19.6-basic-19.6.0.0.0-1.x86_64.rpm
sudo rpm -ivh oracle-instantclient19.6-sqlplus-19.6.0.0.0-1.x86_64.rpm
sudo rpm -ivh oracle-instantclient19.6-tools-19.6.0.0.0-1.x86_64.rpm

Then I copied the TC1.dmp file to the EC2 instance.

Then I used TWO_TASK to point to my RDS database:

export TWO_TASK='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=ora19.xxxxxxxx.us-east-1.rds.amazonaws.com)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orcl)))'

/usr/lib/oracle/19.6/client64/bin/imp user/pw file=TC1.dmp FROMUSER='scott' TOUSER='kylelf'

I also had a hiccup: the first database I used was 11.2 while imp came from the 19.0 distribution, so I got this error:

 

IMP-00003: ORACLE error 942 encountered
ORA-00942: table or view does not exist
IMP-00023: Import views not installed, please notify your DBA
IMP-00000: Import terminated unsuccessfully

So I had to create a version 19 Oracle database, and then the import worked.

 

Developing Applications with CockroachDB… end-to-end example with SpringBoot, Java, Hibernate, Gradle

overview

Application development frameworks like SpringBoot have done a good job of giving developers a quick start to developing applications with Java. Layering object-relational mapping on top of SQL to remove proprietary syntax has further simplified the development of database applications.

CockroachDB allows developers to take advantage of the evolution of globally distributed data with a simple SQL interface. But each database can bring its own specific syntax and cause developers to hesitate when getting started with a new database.

How do you get started?

This blog will use SpringBoot, Hibernate, Java, and Gradle to show a simple end-to-end solution with CockroachDB.

demystifying cockroach datasource

If you have developed an application with Postgres, developing with CockroachDB will be a breeze. CockroachDB was written to be compatible with the Postgres 9.6 dialect. The connect string URL used to connect to CockroachDB is plug compatible with the JDBC driver for Postgres. This allows you to use various tools like DBeaver to connect and run SQL statements.

For ORMs like Hibernate, you connect just like you were connecting to Postgres. Currently, Hibernate is compatible with PostgreSQL95Dialect. So to connect to CockroachDB, the following should be configured in the application.properties file with SpringBoot:

spring.datasource.url=jdbc:postgresql://localhost:26257/defaultdb?sslmode=disable
spring.datasource.driver-class-name=org.postgresql.Driver
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.PostgreSQL95Dialect

That’s really it!
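To illustrate the plug-compatibility outside of Spring, here is a minimal plain-JDBC sketch of my own (it assumes the standard PostgreSQL JDBC driver is on the classpath and an insecure local cluster like the one above):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CockroachJdbcCheck {
    public static void main(String[] args) throws Exception {
        // CockroachDB speaks the Postgres wire protocol, so the pgjdbc URL works as-is
        String url = "jdbc:postgresql://localhost:26257/defaultdb?sslmode=disable";
        try (Connection conn = DriverManager.getConnection(url, "root", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT version()")) {
            while (rs.next()) {
                System.out.println(rs.getString(1)); // prints the CockroachDB version string
            }
        }
    }
}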

The rest of this blog will walk you through the steps with SpringBoot quick start.

Step 1 — Quick Start with SpringBoot

Open the quick start to create the basic framework required. For this example, I used the following options:

  • Project: Gradle Project
  • Language: Java
  • SpringBoot: 2.2.4
  • Project Metadata
    • group: example.com
    • Artifact: first_cockroach /w Options…
      • name: first_cockroach
      • Description: Demo project for Spring Boot
      • Package name: com.example.first_cockroach
      • Packaging: JAR, Java 8

The page should look like so:

Select the hamburger drop-down under dependencies to add the following items:

  • Developer Tools
    • Spring Boot DevTools
    • Lombok
  • Web
    • Rest Repositories
  • SQL
    • Spring Data JPA
    • Flyway Migration
    • PostgreSQL Driver


Once these are selected, you can simply click to generate the quick start package. If you use the artifact name above, you should have a first_cockroach.zip file to get started.

Step 2 — unzip artifact and add connectivity resource

Unzip the artifact created in Step 1 in your working environment and navigate to the src/main/resources directory. Once in this directory, create the application.properties file that defines how to connect to the database and how to migrate data with Flyway.

### Setup Postgres driver
spring.datasource.driver-class-name=org.postgresql.Driver
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.PostgreSQL95Dialect
spring.datasource.username=root
spring.datasource.password=
spring.datasource.type=com.zaxxer.hikari.HikariDataSource
### Connect URL for localhost with port 26257 /w insecure connection
### Database name is defaultdb
spring.datasource.url=jdbc:postgresql://localhost:26257/defaultdb?sslmode=disable
### Set baseline-on-migrate with Flyway
spring.flyway.baseline-on-migrate=true
### Set baseline-version for Flyway to execute migrations at version 1 or more
spring.flyway.baseline-version=0

For this simple test, I created a simple cockroach test cluster on my laptop. For an even easier demo cluster, you can simply run cockroach demo.

Step 3 — create flyway migration file

Simply create the appropriate flyway migration files in the src/main/resources/db/migration directory. For this simple test, I created a file V1__AddPetsTable.sql to create and populate the pets table.

--##
--##     Flyway Migration File
--##         src/main/resources/db/migration/V1__AddPetsTable.sql
--##
--## Create pets table
--##
CREATE TABLE pets
(
  id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
  name STRING,
  type STRING,
  indoor BOOLEAN DEFAULT TRUE
);
--##
--## Define the herd
--##
INSERT INTO pets (name, type)
VALUES ('tula', 'cat'),('ruby','dog'),('rosie','dog');
INSERT INTO pets (name, type, indoor)
VALUES ('virgil', 'squirrel', false);

In a typical project, this is where Cockroach-specific syntax is used to define tables and indexes that utilize features like geo-partitioning, duplicate indexes, and inverted indexes.
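To give a flavour of that syntax, here is a small hedged sketch (my own example, not part of this demo; the attrs column is hypothetical) of an inverted index over a JSONB column:

--## Hypothetical example, not part of the demo schema:
--## add a JSONB column and index its contents with an inverted index.
ALTER TABLE pets ADD COLUMN attrs JSONB;
CREATE INVERTED INDEX pets_attrs_idx ON pets (attrs);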

Step 4 — entities, controllers, repositories, and services

Spring will have created the src/main/java/com/example/first_cockroach directory with the FirstCockroachApplication.java file as a starting point for this project. Within this directory, create directories to be used to define various objects and services.

cd src/main/java/com/example/first_cockroach
mkdir entities
mkdir controllers
mkdir repositories
mkdir services

Step 5 — Create Entities for Pets

Now that the table is defined, we can create an object that maps to the pets table. Create the Pets.java file in the src/main/java/com/example/first_cockroach/entities directory.

//
// Pets.java
//
package com.example.first_cockroach.entities;
//
import lombok.Data;
//
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import java.util.UUID;
//
@Entity(name = "PETS")
@Data
public class Pets {
    @Id
    @GeneratedValue
    private UUID id;
    private String name;
    private String type;
    private Boolean indoor = true;
}

Step 6 — Create Controller for Pets

This controller defines how to insert data into the pets table via a RESTful endpoint. Create the PetsController.java file in the src/main/java/com/example/first_cockroach/controllers directory.

// PetsController.java
//
package com.example.first_cockroach.controllers;
//
import com.example.first_cockroach.entities.Pets;
import com.example.first_cockroach.services.PetsService;
import org.springframework.data.rest.webmvc.RepositoryRestController;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;
//
import javax.servlet.http.HttpServletRequest;
import java.net.URI;
//
import static org.springframework.web.bind.annotation.RequestMethod.POST;
//
@RepositoryRestController
public class PetsController {
    //
    private final PetsService petsService;
    //
    public PetsController(PetsService petsService) {
        this.petsService = petsService;
    }
    //
    @RequestMapping(method = POST, value = "/pets")
    public @ResponseBody ResponseEntity<?> createPets(@RequestBody Pets pet, HttpServletRequest request) {
        Pets createdPet = petsService.save(pet);
        //
        URI createdUri = URI.create(request.getRequestURL().toString() + "/" + createdPet.getId());
        return ResponseEntity.created(createdUri).body(createdPet);
    }
}

Step 7 — Create Repositories for Pets

This repository defines how to look up data in the pets table via the RESTful interface. Create the PetsRepository.java file in the src/main/java/com/example/first_cockroach/repositories directory.

// PetsRepository.java
//
package com.example.first_cockroach.repositories;
//
import com.example.first_cockroach.entities.Pets;
import org.springframework.data.repository.PagingAndSortingRepository;
import org.springframework.data.repository.query.Param;
import org.springframework.data.rest.core.annotation.RepositoryRestResource;
//
import java.util.List;
import java.util.UUID;
//
@RepositoryRestResource(collectionResourceRel = "pets", path = "pets")
public interface PetsRepository extends PagingAndSortingRepository<Pets, UUID> {
    //
    List<Pets> findByName(@Param("name") String name);
}

Step 8 — Create Services for Pets

This defines the service layer for Pets. Create the PetsService.java file in the src/main/java/com/example/first_cockroach/services directory.

// PetsService.java
//
package com.example.first_cockroach.services;
//
import com.example.first_cockroach.entities.Pets;
import com.example.first_cockroach.repositories.PetsRepository;
import org.springframework.stereotype.Service;
//
@Service
public class PetsService {
    //
    private final PetsRepository petsRepository;
    //
    public PetsService(PetsRepository petsRepository) {
        this.petsRepository = petsRepository;
    }
    //
    public Pets save(Pets pet) {
        return petsRepository.save(pet);
    }
}

Step 9 — Time to RUN!

Now we can boot and run the application with SpringBoot and CockroachDB! Simply type ./gradlew bootRun. You should see the following:

./gradlew bootRun
> Task :bootRun
  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::        (v2.2.4.RELEASE)
2020-02-28 11:22:41.907  INFO 4390 --- [  restartedMain] c.e.f.FirstCockroachApplication          : Starting FirstCockroachApplication on MacBook-Pro-9.local with PID 4390 (/Users/glenn/git/misc_projects_glenn/workshop_java_corelogic/java_gradle_glenn_v1/first_cockroach/build/classes/java/main started by glenn in /Users/glenn/git/misc_projects_glenn/workshop_java_corelogic/java_gradle_glenn_v1/first_cockroach)
...
<=========----> 75% EXECUTING [1m 28s]
> :bootRun

Once it is running, open another SQL session to the database and check to see the table has been created and the initial pets have been added.

cockroach sql --insecure
root@localhost:26257/defaultdb> select * from pets;
                   id                  |  name  |   type   | indoor
+--------------------------------------+--------+----------+--------+
  333cf0c1-8245-4b90-9f17-6c059de57fb7 | tula   | cat      |  true
  6f61408f-9074-4d00-80ac-2e189aacf62c | virgil | squirrel | false
  7e4dfb73-9f3d-4e64-aade-1b7a8457ac51 | ruby   | dog      |  true
  ce32bd86-6485-4846-af14-55e66eaf792a | rosie  | dog      |  true
(4 rows)

Now let’s try the RESTFUL interface to retrieve the pet data from http://localhost:8080/pets:

curl http://localhost:8080/pets
{
  "_embedded" : {
    "pets" : [ {
      "name" : "tula",
      "type" : "cat",
      "indoor" : true,
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/pets/333cf0c1-8245-4b90-9f17-6c059de57fb7"
        },
        "pets" : {
          "href" : "http://localhost:8080/pets/333cf0c1-8245-4b90-9f17-6c059de57fb7"
        }
      }
    }, {
      "name" : "virgil",
      "type" : "squirrel",
      "indoor" : false,
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/pets/6f61408f-9074-4d00-80ac-2e189aacf62c"
        },
        "pets" : {
          "href" : "http://localhost:8080/pets/6f61408f-9074-4d00-80ac-2e189aacf62c"
        }
      }
    }, {
      "name" : "ruby",
      "type" : "dog",
      "indoor" : true,
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/pets/7e4dfb73-9f3d-4e64-aade-1b7a8457ac51"
        },
        "pets" : {
          "href" : "http://localhost:8080/pets/7e4dfb73-9f3d-4e64-aade-1b7a8457ac51"
        }
      }
    }, {
      "name" : "rosie",
      "type" : "dog",
      "indoor" : true,
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/pets/ce32bd86-6485-4846-af14-55e66eaf792a"
        },
        "pets" : {
          "href" : "http://localhost:8080/pets/ce32bd86-6485-4846-af14-55e66eaf792a"
        }
      }
    } ]
  },
  "_links" : {
    "self" : {
      "href" : "http://localhost:8080/pets{?page,size,sort}",
      "templated" : true
    },
    "profile" : {
      "href" : "http://localhost:8080/profile/pets"
    },
    "search" : {
      "href" : "http://localhost:8080/pets/search"
    }
  },
  "page" : {
    "size" : 20,
    "totalElements" : 5,
    "totalPages" : 1,
    "number" : 0
  }
 }

Now let’s test inserting data via POST to the pets application:

curl -i -X POST -H "Content-Type:application/json" -d '{"name": "Mazie","type": "dog","inside": "true"}' http://localhost:8080/pets
HTTP/1.1 201
Vary: Origin
Vary: Access-Control-Request-Method
Vary: Access-Control-Request-Headers
Location: http://localhost:8080/pets/45234458-6468-4d24-8a2d-b0dad6b8881d
Content-Type: application/hal+json
Transfer-Encoding: chunked
Date: Fri, 28 Feb 2020 19:42:24 GMT

{
  "id" : "45234458-6468-4d24-8a2d-b0dad6b8881d",
  "name" : "Mazie",
  "type" : "dog",
  "indoor" : true
}

Finally, we can query the pets table to ensure the data is stored within CockroachDB.

root@localhost:26257/defaultdb> select * from pets;
                   id                  |  name  |   type   | indoor
+--------------------------------------+--------+----------+--------+
  333cf0c1-8245-4b90-9f17-6c059de57fb7 | tula   | cat      |  true
  45234458-6468-4d24-8a2d-b0dad6b8881d | Mazie  | dog      |  true
  6f61408f-9074-4d00-80ac-2e189aacf62c | virgil | squirrel | false
  7e4dfb73-9f3d-4e64-aade-1b7a8457ac51 | ruby   | dog      |  true
  ce32bd86-6485-4846-af14-55e66eaf792a | rosie  | dog      |  true
(5 rows)

final thoughts

This was meant as a simple example of how to get started with SpringBoot and CockroachDB. If you are developing a highly concurrent application with CockroachDB, you will need to consider coding for retries, since transactions run at the serializable isolation level and may need to be retried on conflict. This is often handled with retry logic in the SpringBoot application.
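As a rough illustration of that retry pattern (my own sketch, not part of the original project: it assumes the spring-retry dependency is on the classpath and @EnableRetry is configured), a retrying wrapper around the save path could look something like this:

// RetryingPetsService.java (hypothetical, for illustration only)
package com.example.first_cockroach.services;

import com.example.first_cockroach.entities.Pets;
import com.example.first_cockroach.repositories.PetsRepository;
import org.springframework.dao.TransientDataAccessException;
import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class RetryingPetsService {

    private final PetsRepository petsRepository;

    public RetryingPetsService(PetsRepository petsRepository) {
        this.petsRepository = petsRepository;
    }

    // CockroachDB serialization conflicts (SQLSTATE 40001) generally surface through
    // Spring as transient data access exceptions, so retry a few times with a short backoff.
    @Retryable(value = TransientDataAccessException.class, maxAttempts = 3,
               backoff = @Backoff(delay = 100))
    @Transactional
    public Pets save(Pets pet) {
        return petsRepository.save(pet);
    }
}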

I hope this was useful to show how to get started developing SpringBoot applications that use CockroachDB.

Below are various links for building RESTful applications with SpringBoot, Hibernate, Java, and Gradle. Thanks to Vinh Thai for the guidance, example, and pointers.

Reference Documentation

For further reference, please consider the following sections:

Guides

The following guides illustrate how to use some features concretely:

Video : SQLcl and Liquibase : Automating Your SQL and PL/SQL Deployments

In today’s video we’ll give a quick demonstration of applying changes to the database using the Liquibase implementation in SQLcl.

The video is based on this article.

You might also find these useful. The secure external password store is a good way to make connections with SQLcl. If you support a variety of database engines, you may prefer to use the regular Liquibase client.

The star of today’s video is Steve Karam from Delphix.

Refactoring procedural to SQL – an example with MySQL Sakila

By Franck Pachot

What I want to show in this blog post is that, as in mathematics where you have to apply some algebra rules to transform an equation to an equivalent one, the database developer must translate the business specification to an equivalent that is optimized (in performance, reliability and readability) for the data model.

I was looking at the Sakila sample database provided with MySQL. It simulates a DVD rental store. For my younger readers who wonder what this is, you can imagine a pre-Netflix generation where, when you want to watch a movie, you read it from a storage medium that you bring home, rather than streaming it through the internet from a distant data center. I’ll write someday about how this Netflix approach, while being more agile (you choose a movie and can watch it immediately, on demand), is a terrible resource-consumption design which, in my opinion, is not sustainable. This DVD vs. Netflix example may be good for introducing data gravity and processing data in the database. But that’s not the topic here. Or maybe it is, because I’ll talk about optimization, and stored procedures…

I have installed MySQL in my Oracle Cloud Free Tier instance (Oracle Linux 7) with the following:


sudo yum install -y https://yum.oracle.com/repo/OracleLinux/OL7/MySQL80_community/x86_64/get...
sudo systemctl start mysqld.service
mysql -uroot -p
mysql --connect-expired-password --user=root \
--password=$(sudo awk '/temporary password/{print $NF}' /var/log/mysqld.log) \
-e "alter user 'root'@'localhost' identified by '2020 @FranckPachot';"

Then I have downloaded and run the Sakila sample database installation:


curl https://downloads.mysql.com/docs/sakila-db.tar.gz | tar -xzvf - -C /var/tmp
mysql --user=root --password='2020 @FranckPachot' < /var/tmp/sakila-db/sakila-schema.sql
mysql --user=root --password='2020 @FranckPachot' < /var/tmp/sakila-db/sakila-data.sql

And I was looking at the example in the documentation: https://dev.mysql.com/doc/sakila/en/sakila-usage.html#sakila-usage-rent-a-dvd which starts the “Rent a DVD” use-case by checking if the DVD is available in stock:


mysql> SELECT inventory_in_stock(10);
+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      1 |
+------------------------+

This function is defined as:


DELIMITER $$
CREATE FUNCTION inventory_in_stock(p_inventory_id INT) RETURNS BOOLEAN
READS SQL DATA
BEGIN
    DECLARE v_rentals INT;
    DECLARE v_out     INT;

    #AN ITEM IS IN-STOCK IF THERE ARE EITHER NO ROWS IN THE rental TABLE
    #FOR THE ITEM OR ALL ROWS HAVE return_date POPULATED

    SELECT COUNT(*) INTO v_rentals
    FROM rental
    WHERE inventory_id = p_inventory_id;

    IF v_rentals = 0 THEN
      RETURN TRUE;
    END IF;

    SELECT COUNT(rental_id) INTO v_out
    FROM inventory LEFT JOIN rental USING(inventory_id)
    WHERE inventory.inventory_id = p_inventory_id
    AND rental.return_date IS NULL;

    IF v_out > 0 THEN
      RETURN FALSE;
    ELSE
      RETURN TRUE;
    END IF;
END $$

DELIMITER ;

 
I was really surprised by the complexity of this: 2 queries in a procedural function for something that I expect to be just a simple SQL query. I guess the purpose was to show what can be done within a procedure but it is a very bad example for people starting with SQL. At least the comment is clear:


    #AN ITEM IS IN-STOCK IF THERE ARE EITHER NO ROWS IN THE rental TABLE
    #FOR THE ITEM OR ALL ROWS HAVE return_date POPULATED

But the code seems to be a one-to-one translation of this sentence to procedural code. And as I mentioned in the introduction, this may have to be worded differently to be optimized. And we must be sure, of course, that the transformation is a functional equivalent. This is called “refactoring”.

Prefer SQL to procedural language

There are several reasons we should prefer one SQL query, when possible, to procedural code. First, because different languages are usually executed in different engines, and that means context switches and parameter passing. Second, when you run everything as a SQL query, your modules (should I say microservices?) can be merged as a subquery into the calling statement and the query planner can go further to optimize it as a whole. And then you have the advantage of both worlds: the performance of monoliths with the agility of microservices.
In addition to that, you may think that procedural code is easier to read and evolve, but it is actually the opposite. This impression probably comes from the way IT is taught at school. Students learn a lot of 3rd-generation (procedural) languages, as interpreted scripts or compiled programs. When I was at university I also learned some 4th-generation (declarative) languages like Prolog and, of course, SQL. But this is, in my opinion, neglected today. The language is not important but the logic is. A declarative language is like a math formula: probably difficult to start with, because of the high level of abstraction. But then, once the logic is understood, it becomes obvious and error-safe. In today’s developer life, this means: more errors encountered before arriving at a solution, but fewer bugs and side effects once the solution is found.

There’s another reason to prefer one SQL statement rather than many in a procedure. The latter may have its behavior depending on the isolation level, and you may have to think about exception handling. A SQL statement is always atomic and consistent.

Static analysis

I’ll re-write this function as a simple query and, before looking at the dynamic of the procedural language, I’m looking at bounding the scope: the tables and columns used. This will help to define the tests that should cover all possibilities.
The function reads two tables: the INVENTORY (one row per DVD) and RENTAL (one row per rent transaction). On INVENTORY we check only the presence of the INVENTORY_ID. On RENTAL we check the presence of the INVENTORY_ID and we also check the presence of a RETURN_DATE to know if it is back to the stock.

Define the tests

When I refactor an existing function, the improved performance is only the secondary goal. The first goal is to be sure that my new proposal is functionally equivalent to the existing one. Then, it is critical to build the tests to validate this.

I’ll use the same item, inventory_id=10, and update the tables to cover all variations. I’ll roll back between each test, so I must disable autocommit:

--------------
set autocommit=0
--------------

This DVD, in the initial state of the sample database, has been rented 3 times and is now back in the stock (all rentals have a return date):

--------------
select * from inventory where inventory_id=10
--------------

+--------------+---------+----------+---------------------+
| inventory_id | film_id | store_id | last_update         |
+--------------+---------+----------+---------------------+
|           10 |       2 |        2 | 2006-02-15 05:09:17 |
+--------------+---------+----------+---------------------+
--------------
select * from rental where inventory_id=10
--------------

+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+
| rental_id | rental_date         | inventory_id | customer_id | return_date         | staff_id | last_update         |
+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+
|      4364 | 2005-07-07 19:46:51 |           10 |         145 | 2005-07-08 21:55:51 |        1 | 2006-02-15 21:30:53 |
|      7733 | 2005-07-28 05:04:47 |           10 |          82 | 2005-08-05 05:12:47 |        2 | 2006-02-15 21:30:53 |
|     15218 | 2005-08-22 16:59:05 |           10 |         139 | 2005-08-30 17:01:05 |        1 | 2006-02-15 21:30:53 |
+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+

The INVENTORY_IN_STOCK function returns “true” in this case:

--------------
SELECT inventory_in_stock(10)
--------------

+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      1 |
+------------------------+

My second test simulates a rental that is not yet returned, by setting the RETURN_DATE of the last rental to null:

--------------
update rental set return_date=null where inventory_id=10 and rental_id=15218
--------------

--------------
select * from inventory where inventory_id=10
--------------

+--------------+---------+----------+---------------------+
| inventory_id | film_id | store_id | last_update         |
+--------------+---------+----------+---------------------+
|           10 |       2 |        2 | 2006-02-15 05:09:17 |
+--------------+---------+----------+---------------------+
--------------
select * from rental where inventory_id=10
--------------

+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+
| rental_id | rental_date         | inventory_id | customer_id | return_date         | staff_id | last_update         |
+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+
|      4364 | 2005-07-07 19:46:51 |           10 |         145 | 2005-07-08 21:55:51 |        1 | 2006-02-15 21:30:53 |
|      7733 | 2005-07-28 05:04:47 |           10 |          82 | 2005-08-05 05:12:47 |        2 | 2006-02-15 21:30:53 |
|     15218 | 2005-08-22 16:59:05 |           10 |         139 | NULL                |        1 | 2020-02-29 22:27:15 |
+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+

There, the INVENTORY_IN_STOCK returns “false” because the DVD is out:

--------------
SELECT inventory_in_stock(10)
--------------

+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      0 |
+------------------------+

My third test simulates a case that should not happen: multiple rentals not returned. If this exists in the database, it can be considered as corrupted data because we should not be able to rent a DVD that was not returned from the previous rental. But when I replace a function with a new version, I want to be sure that the behavior is the same even in case of a bug.

--------------
rollback
--------------

--------------
update rental set return_date=null where inventory_id=10
--------------

--------------
select * from inventory where inventory_id=10
--------------

+--------------+---------+----------+---------------------+
| inventory_id | film_id | store_id | last_update         |
+--------------+---------+----------+---------------------+
|           10 |       2 |        2 | 2006-02-15 05:09:17 |
+--------------+---------+----------+---------------------+
--------------
select * from rental where inventory_id=10
--------------

+-----------+---------------------+--------------+-------------+-------------+----------+---------------------+
| rental_id | rental_date         | inventory_id | customer_id | return_date | staff_id | last_update         |
+-----------+---------------------+--------------+-------------+-------------+----------+---------------------+
|      4364 | 2005-07-07 19:46:51 |           10 |         145 | NULL        |        1 | 2020-02-29 22:27:16 |
|      7733 | 2005-07-28 05:04:47 |           10 |          82 | NULL        |        2 | 2020-02-29 22:27:16 |
|     15218 | 2005-08-22 16:59:05 |           10 |         139 | NULL        |        1 | 2020-02-29 22:27:16 |
+-----------+---------------------+--------------+-------------+-------------+----------+---------------------+
--------------
SELECT inventory_in_stock(10)
--------------

+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      0 |
+------------------------+

This returns “false” for the presence in the stock and I want to return the same in my new function.

I will not test the case where rentals exist but the DVD is not in the inventory, because this situation cannot happen thanks to referential integrity constraints:


--------------
delete from inventory
--------------

ERROR 1451 (23000) at line 130 in file: 'sakila-mysql.sql': Cannot delete or update a parent row: a foreign key constraint fails (`sakila`.`rental`, CONSTRAINT `fk_rental_inventory` FOREIGN KEY (`inventory_id`) REFERENCES `inventory` (`inventory_id`) ON DELETE RESTRICT ON UPDATE CASCADE)

The next test is about an item which never had any rental:

--------------
rollback
--------------

--------------
delete from rental where inventory_id=10
--------------

--------------
select * from inventory where inventory_id=10
--------------

+--------------+---------+----------+---------------------+
| inventory_id | film_id | store_id | last_update         |
+--------------+---------+----------+---------------------+
|           10 |       2 |        2 | 2006-02-15 05:09:17 |
+--------------+---------+----------+---------------------+
--------------
select * from rental where inventory_id=10
--------------

This means that the DVD is available in the stock:


--------------
SELECT inventory_in_stock(10)
--------------

+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      1 |
+------------------------+

One situation remains: the item is not in the inventory and has no rows in RENTAL:

--------------
delete from inventory where inventory_id=10
--------------

--------------
select * from inventory where inventory_id=10
--------------

--------------
select * from rental where inventory_id=10
--------------

--------------
SELECT inventory_in_stock(10)
--------------

+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      1 |
+------------------------+

For this one, the INVENTORY_IN_STOCK function defined in the Sakila sample database returns “true”, as if we could rent a DVD that is not in the inventory. This situation probably has a low probability of being encountered, because the application cannot get an INVENTORY_ID if the row is not in the inventory. But a function should be correct in all cases, without any assumption about the context in which it is called. Are you sure that some concurrent modification cannot hit this case: one user removing a DVD and another accepting to rent it at the same time? That’s the problem with procedural code: it is very hard to ensure that we cover all the cases that can happen with various loops and branches. With declarative code like SQL, because it follows a declarative logic, the risk is lower.
For this refactoring, I’ll make sure that I have the same result even if I think this is a bug. Anyway, it is easy to add a join to INVENTORY if we want to fix this.
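For the record, a hedged sketch of such a fix (my addition, not used in the refactoring below) would simply require the row to exist in INVENTORY as well:

-- Returns "true" only when the item exists in INVENTORY
-- and has no rental that is still out (NULL return_date).
select exists (
         select 1 from inventory where inventory_id=10
       )
   and not exists (
         select 1 from rental
         where inventory_id=10 and return_date is null
       );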

--------------
rollback
--------------

Single SQL for the same logic

The function returns “true”, meaning that the item is in-stock, when there are no rows in RENTAL (never rented) or if all rows in RENTAL have the RETURN_DATE populated (meaning that it is back to stock).

The Sakila author has implemented those two situations as two queries but that is not needed. I’ll show a few alternatives, all in one SQL query.

NOT EXISTS

“All rows in RENTAL have RETURN_DATE not null” can be translated to “no row in RENTAL where RETURN_DATE is null”, and that can be implemented with a NOT EXISTS subquery. This also covers the case where there is no row at all in RENTAL:

select not exists (
 select 1 from rental where rental.inventory_id=10
 and return_date is null
);

This returns the same true/false value as the INVENTORY_IN_STOCK function. But the parameter for inventory_id must be embedded in the query (the value 10 here), and parameterizing it still requires a procedure or function because MySQL has no parameterized views.
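If the original function signature has to be kept, a minimal sketch of a single-statement version (my own rewrite, based on the NOT EXISTS form above) could be:

DELIMITER $$
CREATE FUNCTION inventory_in_stock_v2(p_inventory_id INT) RETURNS BOOLEAN
READS SQL DATA
BEGIN
  -- In stock when no rental for this item is still out (no NULL return_date)
  RETURN NOT EXISTS (
    SELECT 1 FROM rental
    WHERE inventory_id = p_inventory_id
      AND return_date IS NULL);
END $$
DELIMITER ;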

NOT IN

A similar query can use NOT IN to return “false” as soon as there is a rental that was not returned:

select (10) not in (
 select inventory_id from rental where return_date is null
);

The advantage here is that the parameter (the value 10 here) is outside of the subquery.

VIEW TO ANTI-JOIN

Then, the subquery can be defined as a view where the logic (the return_date being null) is encapsulated and usable in many places:


create or replace view v_out_of_stock as
 select inventory_id from rental where return_date is null;

And we can use it from another query (probably a query that gets the INVENTORY_ID by its name) with a simple anti-join:


select (10) not in (
 select inventory_id from v_out_of_stock
);

VIEW TO JOIN

If you prefer a view to join to (rather than anti-join), you can add the INVENTORY table into the view. But then, you will have a different result in the case where the inventory_id does not exist (where I think the Sakila function is not correct).


create or replace view v_inventory_instock as
select inventory_id from inventory
where inventory_id not in (
 select inventory_id from rental where return_date is null
);

Then, if you are coming from INVENTORY, you can simply join to this view:


select '=== not same in view',(10) in (
 select inventory_id from inventory natural join v_inventory_instock
);

I used a natural join because I know there is only one common column, but be careful with that. If in doubt, just join with a USING clause.

Of course, this may read the INVENTORY table twice because the MySQL optimizer does not detect that this join can be eliminated:


explain
select * from inventory natural join v_inventory_instock where inventory_id=10
--------------

+----+-------------+-----------+------------+-------+---------------------+---------------------+---------+-------+------+----------+-------------------------+
| id | select_type | table     | partitions | type  | possible_keys       | key                 | key_len | ref   | rows | filtered | Extra                   |
+----+-------------+-----------+------------+-------+---------------------+---------------------+---------+-------+------+----------+-------------------------+
|  1 | SIMPLE      | inventory | NULL       | const | PRIMARY             | PRIMARY             | 3       | const |    1 |   100.00 | NULL                    |
|  1 | SIMPLE      | inventory | NULL       | const | PRIMARY             | PRIMARY             | 3       | const |    1 |   100.00 | Using index             |
|  1 | SIMPLE      | rental    | NULL       | ref   | idx_fk_inventory_id | idx_fk_inventory_id | 3       | const |    3 |   100.00 | Using where; Not exists |
+----+-------------+-----------+------------+-------+---------------------+---------------------+---------+-------+------+----------+-------------------------+

However, this is still cheaper than calling a function. And you may prefer to create the in-stock view with all the columns from INVENTORY, so that you don’t have to join to it at all.

COUNT

If you prefer a COUNT rather than an EXISTS like what was done in the Sakila function, you can compare the number of rentals with the number of returns:


select count(rental_date)=count(return_date) from rental where inventory_id=10 ;

This works as well when there are no rentals, as the count is zero for both. However, it leaves fewer possibilities for the optimizer. With a NOT IN, the scan of all rentals can stop as soon as a row is encountered. With a COUNT, all rows will be read.

Performance

As the Sakila function is designed, taking an INVENTORY_ID as input, there is a good chance that the application calls this function after a visit to the INVENTORY table.

You can test this with the original Sakila sample database with just the following additional view, which lists each inventory_id with its in-stock status:


create or replace view v_inventory_stock_status as
select inventory_id, inventory_id not in (
 select inventory_id from rental where return_date is null
) inventory_in_stock
from inventory
;

I have run the following to get the execution time when checking the in-stock status for each inventory item:


time mysql --user=root --password='2020 @FranckPachot'  -bve "
use sakila;
select count(*),inventory_in_stock(inventory_id) 
from inventory 
group by inventory_in_stock(inventory_id);"
--------------
select count(*),inventory_in_stock(inventory_id) from inventory group by inventory_in_stock(inventory_id)
--------------

+----------+----------------------------------+
| count(*) | inventory_in_stock(inventory_id) |
+----------+----------------------------------+
|     4398 |                                1 |
|      183 |                                0 |
+----------+----------------------------------+

real    0m2.272s
user    0m0.005s
sys     0m0.002s

Roughly 2 seconds for those 4,581 calls to the inventory_in_stock() function, each of which executed one or two queries.

Now here is the same group by using an outer join to my view:


time mysql --user=root --password='2020 @FranckPachot'  -bve "
use sakila;
select count(inventory_id),inventory_in_stock
from inventory
natural left outer join v_inventory_stock_status
group by inventory_in_stock;"

select count(inventory_id),inventory_in_stock
from inventory
natural left outer join v_inventory_stock_status
group by inventory_in_stock
--------------

+---------------------+--------------------+
| count(inventory_id) | inventory_in_stock |
+---------------------+--------------------+
|                4398 |                  1 |
|                 183 |                  0 |
+---------------------+--------------------+

real    0m0.107s
user    0m0.005s
sys     0m0.001s

Of course, the timings here are there only to show the idea.


You can test the same in your environment and profile it to understand the overhead behind the recursive calls to the procedure and to its SQL statements. Stored procedures and functions are nice for providing an API to your database, and it is fine to execute one per user call. But when you get the parameters from a SQL statement (INVENTORY_ID here) and call a function which itself executes other SQL statements, you end up with poor performance and a non-scalable design. When you can refactor the functions and replace them with views, you keep the same maintainability with reusable modules, and you also give the SQL optimizer a subquery that can be merged into the calling SQL statement. A join will always be faster than calling a function that executes another SQL statement.

The idea here is common to all databases, but other database engines may have better solutions. PostgreSQL can inline functions that are defined in “language SQL” and this can be used as a parameterized view. Oracle 20c introduces SQL Macros which are like a SQL preprocessor where the function returns a SQL clause rather than computing the result.

The article Refactoring procedural to SQL – an example with MySQL Sakila appeared first on the dbi services blog.

Follow the link…

I’ve blogged on Medium for about one year. Now that I’m back at dbi services, I’ll continue to blog there. Here is the RSS feed:

Note that I’m also there on Twitter and LinkedIn and happy to receive comments, feedback, and questions. And blog posts are also replicated at http://www.oaktable.net/ of course.

So…

The exit directions in the 27km LHC tunnel…

*** TRACE CONTINUES IN https://blog.dbi-services.com/

Databases are slow right?

Man, it gets my goat when people start pontificating nonsense on Twitter.

Yeah, I know, I know, I shouldn’t get upset, because 99% of everything on social media is nonsense anyway, but it is when people roll out claims about database tech that I tend to get tetchy Smile

Today it’s that familiar old chestnut: “Constraints in the database make it slow“. And… this one even came with some numbers to add some apparent weight!

image

Now truth be told, I’m not sure what “QPS” is a reference to here, because if it is “queries per second”, then the number of foreign keys and check constraints is not going to impact the rate at which you can retrieve data from the database. If anything, it might make retrieval faster because you are giving the optimizer more information.

So let’s assume we’re talking about transactions per second here. Let’s explore this claim with a real demo. I’ll create a table to hold some sales transactions, and each sale will have a number of attributes all relating back to parent tables.



--
-- Some parent tables
--
SQL> create table product as select rownum pid, 'product '||rownum descr
  2  from dual connect by level <= 100000;

Table created.

SQL> alter table product add primary key ( pid) ;

Table altered.

SQL>
SQL> create table customer as select rownum cid, 'customer '||rownum descr
  2  from dual connect by level <= 100000;

Table created.

SQL> alter table customer add primary key ( cid) ;

Table altered.

SQL>
SQL> create table store as select rownum sid, 'store '||rownum descr
  2  from dual connect by level <= 100000;

Table created.

SQL> alter table store add primary key ( sid) ;

Table altered.

SQL>
SQL> create table taxcode as select rownum tid, 'taxcode '||rownum descr
  2  from dual connect by level <= 100000;

Table created.

SQL> alter table taxcode add primary key ( tid) ;

Table altered.

--
-- And my sales table that refers back to them
--
SQL> create table sales
  2  ( txn_id int default txnseq.nextval not null ,
  3    pid int not null references product(pid),
  4    cid int not null references customer(cid),
  5    sid int not null references store(sid),
  6    tid int not null references taxcode(tid),
  7    discount varchar2(1) not null,
  8    amount number(10,2) not null,
  9    qty   number(10) not null,
 10    constraint chk1 check ( discount in ('Y','N')),
 11    constraint chk2 check ( amount > 0),
 12    constraint chk3 check ( qty > 0)
 13  );

Table created.

SQL>
SQL> alter table sales add primary key ( txn_id )
  2  using index global partition by hash ( txn_id ) partitions 8;

Table altered.

By my count, I’ve got

  • 8 constraints to check for null-ness
  • 3 check value constraints
  • 1 primary key constraint
  • 4 foreign key constraints

Wow! This thing is constrained to kingdom come! I really should not have bothered proceeding any further, because surely that many constraints make this database totally unusable for anything.  (Yes, I know sarcasm is a sign of poor character…but well, I never claimed to have good character Smile)

Anyway, I’ll persist with what must surely be a hopeless cause. I’ll create a little routine that picks products, stores, customers and tax codes at random, and uses those values to hammer transactions at my SALES table.



SQL> create table hammer_log ( job int, ela interval day to second );

Table created.

SQL>
SQL> create or replace
  2  procedure hammer_time is
  3    p sys.odcinumberlist := sys.odcinumberlist();
  4    c sys.odcinumberlist := sys.odcinumberlist();
  5    s sys.odcinumberlist := sys.odcinumberlist();
  6    t sys.odcinumberlist := sys.odcinumberlist();
  7
  8    timer timestamp;
  9  begin
 10    select trunc(dbms_random.value(1,100000))
 11    bulk collect into p
 12    from dual connect by level <= 32000;
 13
 14    select trunc(dbms_random.value(1,100000))
 15    bulk collect into c
 16    from dual connect by level <= 32000;
 17
 18    select trunc(dbms_random.value(1,100000))
 19    bulk collect into s
 20    from dual connect by level <= 32000;
 21
 22    select trunc(dbms_random.value(1,100000))
 23    bulk collect into t
 24    from dual connect by level <= 32000;
 25
 26    timer := localtimestamp;
 27    for j in 1 .. 3 loop
 28
 29      for i in 1 .. 32000
 30      loop
 31        insert into sales ( pid,cid,sid,tid,discount,amount,qty)
 32        values (p(i), c(i), s(i), t(i), 'Y',i,i);
 33        commit;
 34      end loop;
 35    end loop;
 36
 37    insert into hammer_log values ( sys_context('USERENV','BG_JOB_ID'), localtimestamp-timer);
 38    commit;
 39  end;
 40  /

Procedure created.

Then I’ll use the scheduler to simulate a connection pool throwing transactions at my table. We’ll record the elapsed time for 96,000 transactions per session in our HAMMER_LOG table and see how well this poor suffering bucket of constraints performs.



SQL> declare
  2    j int;
  3  begin
  4    for i in 1 .. 12
  5    loop
  6       dbms_job.submit(j,'hammer_time;');
  7    end loop;
  8    commit;
  9  end;
 10  /

PL/SQL procedure successfully completed.


SQL> select * from hammer_log;

       JOB ELA
---------- -----------------------
     80999 +00 00:00:19.013000
     81005 +00 00:00:19.005000
     81000 +00 00:00:19.461000
     80998 +00 00:00:19.036000
     81003 +00 00:00:19.248000
     81004 +00 00:00:18.375000
     80996 +00 00:00:19.306000
     81002 +00 00:00:19.286000
     80997 +00 00:00:18.825000
     80995 +00 00:00:18.987000
     81006 +00 00:00:19.018000
     81001 +00 00:00:19.180000

12 rows selected.

SQL> select count(*)*96000 / extract(second from max(ela))
  2  from hammer_log;

COUNT(*)*96000/EXTRACT(SECONDFROMMAX(ELA))
------------------------------------------
                                59195.3137

Hmmm….not 500 “qps”.  No, not 5000 “qps” either.

In fact, with all that constraint “baggage”, I’m still lobbing nearly 60,000 transactions per second at this table!

I know what you’re thinking …. “He’s using a 48-core Exadata machine”

Well, I didn’t mention that …. because I’m not!

I’m using a 2016 Dell XPS laptop. According to eBay, it’s worth about $600. (That’s not my exact laptop, but it’s the first one I saw that looked close Smile).

image

But this isn’t about price… this is about putting fictional claims about databases on Twitter to rest.

“Hold on!” comes your rebuttal.. “What about indexes?”

Contrary to popular belief, just because you have foreign keys defined on columns of a table does not mean that you must index those columns.  Sometimes you might need to, but those are generally a fairly niche set of circumstances. But I don’t want my readers thinking I’m pulling a fast one here, so let’s take our SALES table, index those foreign keys up the wazoo, and then redo the benchmark.



SQL> create index sales_ix1 on sales ( pid);

Index created.

SQL> create index sales_ix2 on sales ( sid);

Index created.

SQL> create index sales_ix3 on sales ( cid);

Index created.

SQL> create index sales_ix4 on sales ( tid);

Index created.
[re-run the 12 jobs]
SQL> select count(*)*96000 / extract(second from max(ela))
  2  from hammer_log;

COUNT(*)*96000/EXTRACT(SECONDFROMMAX(ELA))
------------------------------------------
                                36080.5342

1 row selected.

What a bummer…Only 36,000 transactions per second now on my 4 year old laptop.

So yes, I will fully concede that

  • if you’ve chosen not to educate yourself on foreign key indexing, AND
  • you bought your database server off eBay from an Aussie trying to offload a 4 year old laptop, AND
  • you have constraints everywhere, AND
  • you need more than 36,000 transactions per second…

then yes, you might need to look at options. But probably the first option to explore is not to jump onto Twitter and claim that databases don’t scale.

Case closed.

Following Various Legal Action Regarding “Oracle Cloud Revenue”

As a courtesy, I would like to provide a copy of the latest legal document filed in an action being brought against Oracle leadership resulting from alleged, um,  improprieties in revenue reporting over the last few years. There is nothing sacred about this document. Google can find it for you just as easily.

Click on the following link to download the PDF: click here.

From the table of contents:

#000000;" src="https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-1.png?w=500&h=624" alt="" width="500" height="624" srcset="https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-1.png?w=500&h=624 500w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-1.png?w=120&h... 120w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-1.png?w=241&h... 241w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-1.png?w=768&h... 768w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-1.png 878w" sizes="(max-width: 500px) 100vw, 500px" />

 

#000000;" src="https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-2.png?w=500&h=625" alt="" width="500" height="625" srcset="https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-2.png?w=500&h=625 500w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-2.png?w=120&h... 120w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-2.png?w=240&h... 240w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-2.png?w=768&h... 768w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-2.png 878w" sizes="(max-width: 500px) 100vw, 500px" />

#000000;" src="https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-3.png?w=500&h=524" alt="" width="500" height="524" srcset="https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-3.png?w=500&h=524 500w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-3.png?w=143&h... 143w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-3.png?w=286&h... 286w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-3.png?w=768&h... 768w, https://kevinclosson.files.wordpress.com/2020/02/oracle-ca-3.png 879w" sizes="(max-width: 500px) 100vw, 500px" />

SchedLat: a Low Tech Script for Measuring Process CPU Scheduling Latency on Linux

As you may know, I like to use low tech scripts when possible, ideally such ones that don’t even require you to be root. I prefer simple and “boring” tools as a starting point simply because of the production reality at my customers in traditional enterprise IT. Systems where RHEL6 (and clones) seem to be the most common distros (with Linux kernel 2.6.32) and it’s not too unusual to see RHEL5 occasionally either.
