SQL Server on Linux: Why Do I Have Two SQL Server Processes

When starting SQL Server on Linux why are there two (2) sqlservr processes?

systemctl status mssql-server
mssql-server.service - Microsoft SQL Server Database Engine

CGroup: /system.slice/mssql-server.service
           ├─85829 /opt/mssql/bin/sqlservr <--------- WATCHDOG | MONITOR
└─85844 /opt/mssql/bin/sqlservr <--------- SQLSERVER.EXE

The simple answer is the first entry (85829) is not what you are used to on a Windows system as sqlservr.exe and does not listen for TDS traffic or open database files.  The parent process handles basic configuration activities and then forks the child process.  The parent process (WATCHDOG) becomes a lightweight monitor and the child process runs the sqlservr.exe process.

Hint: Process ids can be reused so do not write scripts looking for the largest process id as the first entry may have a process id larger than the second entry.

In general, it is an unsafe practice to capture a dump from the process encountering the exception or fatal condition.  Instead, the WATCHDOG is able to work safely as an external process.  On Windows, if a process terminates unexpectedly the Watson infrastructure is invoked to capture process dumps and add entries to the event log.  The Linux, sqlservr, WATCHDOG can perform a similar role.  If an unexpected SIGNAL or Fatal Error condition is encountered the WATCHDOG is signaled and can capture a dump.  The Linux process, signal handlers as well as the death signal are established to handle issues.

The default behavior for some Linux signals is to TERMINATE or generate a full process CORE DUMP.  For example:  On Linux a process is not allowed to catch the SIGKILL signal.  If SIGKILL is sent to the child process Linux sends the WATCHDOG the registered death signal, informing the WATCHDOG of the child's exit.   For those signals that default to capturing a full process CORE DUMP, SQL Server on Linux installs alternate signal handlers.  The alternate handlers use PalDumper to capture the process information, which are much smaller than the entire process CORE DUMP.

Bob Dorr – Principle SQL Server Software Engineer

Comments

  • Anonymous
    December 14, 2018
    Very good!I noted other curious thing: On error log, the message "Server process ID is " is different from the operating system process id. I noted this on every sql server linux installation (2017, 2019).Also, the "os_thread_id" on sys.dm_os_threads not reflect same number from operating system (like ps, ,htop, or looking at /proc//tasks filesystem).Can you explain why?
    • Anonymous
      December 18, 2018
      The operating system id is the Linux process but SQLPAL hosts multiple, virtual processes (sqlservr.exe, sqlagent.exe, etc.) The Process id (PID) in the error log is the virtual process id provided by SQLPAL. That is why you can't match it directly to the Linux process id.