##############################################################################
@RELEASE: 7.0-2d
This maintenance release includes several key fixes (especially on Windows)
and some changes to the supervisor, core, and worker, and is a recommended
update to all 7.0-x customers.
 
==== CL 20637 ====
@FIX: install_supervisor script: remove attempt (which fails) to install Data W/H DB by calling $QBDIR/datawh/install_datawarehouse_db.sh
 
==== CL 20631 ====
@FIX: "qbadmin s --config" doesn't return "supervisor_idle_threads" when that parameter is commented out in qb.conf
 
JIRA: QUBE-2653
 
==== CL 20622 ====
@FIX: qbjobs "-flags" option broken
 
JIRA: QUBE-3520
 
==== CL 20611 ====
@CHANGE: removed submission check for zero workers (i.e., jobs now submit fine when there are no workers on the farm)
 
==== CL 20584 ====
@FIX: minor memory leak fix in PCRE usage in QbRegEx.cpp.
 
==== CL 20582 ====
@FIX: add back Perl 5.10 and 5.12 support for CentOS/RHEL 6.x.
 
ZD: 19384
 
==== CL 20496 ====
@FIX: modified the way job instances are killed on Windows workers. The old way was causing issues (i.e. jobs not kill-able) in some environments where the worker is running in service mode AND GPO is used.
Some related symptoms/indications included:
 
* Seeing the following error message in the workerlog:
ERROR: QbWorker::killJob(), PostThreadMessage() Invalid thread identifier
 
* "qblock --purge <worker>" not killing off locally running job instances as expected.
 
* Aggressive preemptions not working as expected.
 
Note: this was previously worked around by tweaking the "UI0Detect" (aka
Interactive Services Detection service) setting in the registry, but
Windows 10 update 1803 removed the service, thus disabling this workaround.
 
ZD: 19473, 19457, 19485
 
==== CL 20432 ====
@NEW: added "-chunk <int>" option to job_cleanup.py script, to allow specifying the query size
 
ZD: 19307
 
==== CL 20431 ====
@NEW: add (exposed) new options to Perl API qb.jobinfo(): minid, maxid, limit, and orderby
 
ZD: 19307
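
The chunk size and the minid/maxid/limit options lend themselves to walking a large job table in bounded windows. A minimal sketch in Python, assuming inclusive ID bounds; id_windows is a hypothetical helper, not part of the Qube API or of job_cleanup.py:

```python
def id_windows(first_id, last_id, chunk):
    """Yield inclusive (minid, maxid) pairs covering first_id..last_id."""
    if chunk < 1:
        raise ValueError("chunk must be a positive integer")
    start = first_id
    while start <= last_id:
        end = min(start + chunk - 1, last_id)
        yield (start, end)
        start = end + 1


# Each pair could serve as the minid/maxid bounds of one query.
windows = list(id_windows(1, 2500, 1000))
```

Keeping each query to a fixed window bounds the size of any single result set, which is presumably the motivation for exposing the query size.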
 
==== CL 20409 ====
@TWEAK: Python API: modified what gets printed when JobValidator.validate is called with verbose flag set
 
==== CL 20407 ====
@NEW: add new options to qbjobs: -limit, -minid, -maxid, -orderby
@FIX: qbjobs: made "-u" option an alias to "--user"
 
JIRA: QUBE-3492
 
==== CL 20356 ====
@FIX: Windows 10: proxy program crashes near end of instance execution, causing it to become "failed"
On some Windows 10 environments, proxy.exe would crash close to the very end of execution, causing any job
to fail, even though the job process ran fine.
 
ZD: 19224
JIRA: QUBE-3493
 
==== CL 20354 ====
@NEW: add --dbowner=name option so that init_supe_db.py may be used to initialize the DB using a user other than the default "pfx"
Also added "-n|--noexec" option, to allow dry-runs.
 
JIRA: QUBE-3471
ZD: 19069
 
==== CL 20347 ====
@FIX: add code to fix column ordering of Qube DB tables in MySQL, which may have gotten mixed up if the DB was ever upgraded, and can cause the data transfer from MySQL to PGSQL to fail
 
JIRA: QUBE-3474, QUBE-3475
ZD: 19055
 
==== CL 20346 ====
@FIX: minor, mostly cosmetic error in the logic for callback execution code where it determined whether a given callback language was disabled or not

 

##############################################################################
@RELEASE: 7.0-2c

This maintenance release includes a few key fixes to the supervisor, core, and
worker, and is a recommended update to all 7.0-x customers.

Specific to Windows, it also includes a change where all modules and binary
files have been rebuilt against a version of Perl that has been built in-house
from source.

IMPORTANT: A note about Perl versions on Windows workers
--------------------------------------------------------

Perl v5.26 is the only supported version on Windows, and must be installed on
all workers. All other versions of Perl are unsupported.


==== CL 20342 ====
@CHANGE: made sure installed Perl versions other than 5.26 are detected as "unsupported" and properly rejected as such (Windows).
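
A site could run a similar check before rolling this release out to Windows workers. The sketch below only illustrates the rule stated above (5.26 supported, everything else rejected); the function names are hypothetical and this is not the detection code shipped with Qube:

```python
import re

SUPPORTED_WINDOWS_PERL = (5, 26)  # per the note above


def perl_major_minor(version_text):
    """Extract (major, minor) from text such as 'v5.26.1' or '5.26'."""
    match = re.search(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("no Perl version found in %r" % version_text)
    return (int(match.group(1)), int(match.group(2)))


def is_supported_on_windows(version_text):
    """True only for Perl 5.26.x."""
    return perl_major_minor(version_text) == SUPPORTED_WINDOWS_PERL
```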

==== CL 20338 ====
@FIX: Proxy program was responding to orders (such as migrate/interrupt/preempt) prematurely, before the operations had completed,
causing race conditions in some (rare) cases, resulting in issues such as agenda-based job instances going into an infinite "wait" loop.

ZD: 19066

==== CL 20303 ====
@FIX: bug in a couple of SQL queries causing issues, such as tons of unnecessary extra auto-complete operations.

==== CL 20302 ====
@FIX: qblogin password registration failing with "ERROR: passwords don't match" on Windows 10

ZD: 18963
JIRA: QUBE-3439

 

 

##############################################################################
@RELEASE: 7.0-2b

This is a supervisor-only release that fixes a critical bug introduced in
7.0-2a. Recommended for any site currently using a 7.0-x supervisor.

==== CL 20255 ====
@FIX: issue introduced in the 7.0-2a supe where agenda-based jobs would not dispatch to instances.

==== CL 20252 ====
@NEW: add README-DB-VERSIONING.txt file, for informational purposes only.

JIRA: QUBE-3190
 

 

##############################################################################
@RELEASE: 7.0-2a

##############################################################################

This is a supervisor-only release that includes a few key fixes, and is
recommended for any site currently using a 7.0-x supervisor.

==== CL 20233 ====
@FIX: timeout value set via the API routine qbsettimeout() now respected more accurately

ZD: 18524
JIRA: QUBE-3181

==== CL 20223 ====
@FIX: race-condition can dispatch the same agenda item to multiple instances

ZD: 18980

==== CL 20222 ====
@FIX: "running_monitor" background thread/routine would sometimes try to verify instance-worker combinations that don't exist in reality (and sometimes even non-existent instances), then put those instances back to "pending".

ZD: 18960

==== CL 20218 ====
@FIX: supervisor install fails on Windows where .py files are not associated with the python interpreter

==== CL 20215 ====
@FIX: issue where Supe fails to register dependencies of some jobs at submission, due to DB ERROR: duplicate key value

JIRA: QUBE-3427
ZD: 18957

 

##############################################################################
@RELEASE: 7.0-2

##############################################################################

==== CL 20172 ====
@FIX: Multiple false matches for 'regex_outputPaths' when an instance is preempted

==== CL 20151 ====
@FIX: qbhosts, qbjobs, qblock, and qbmodify commands' "-group" option not working as expected.

Note: required a change to both the client programs and the supervisor (which had an incompatible SQL statement).

JIRA: QUBE-3413

==== CL 20131 ====
@CHANGE: add code to check supervisor_max_threads relative to the DB server's max_connections, and adjust it if it's too large.

"Too Large" is: supervisor_max_threads > max_connections - 25

JIRA: QUBE-3396
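
The "too large" rule can be written out as follows. This is a sketch only: the note does not say what value the supervisor adjusts to, so lowering it to max_connections - 25 (and never below 1) is an assumption, and the function names are hypothetical:

```python
RESERVED_CONNECTIONS = 25  # headroom stated in the note above


def is_too_large(supervisor_max_threads, max_connections):
    """Apply the rule: supervisor_max_threads > max_connections - 25."""
    return supervisor_max_threads > max_connections - RESERVED_CONNECTIONS


def adjusted_max_threads(supervisor_max_threads, max_connections):
    """Assumption: an oversized value is lowered to the largest allowed one."""
    if is_too_large(supervisor_max_threads, max_connections):
        # Assumption: never go below one thread.
        return max(max_connections - RESERVED_CONNECTIONS, 1)
    return supervisor_max_threads
```

For example, with max_connections at 100, a supervisor_max_threads of 75 passes the check, while 76 or more would be considered too large.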

==== CL 20115 ====
@FIX: Calling qbstdout or qbstderr API (C++ or Python API) with "pos=-N" on a file of N - 1 bytes or smaller can crash the supervisor process

JIRA: QUBE-3145
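
The general technique for this class of bug is to clamp a negative offset so that it can never point before the start of the file. The sketch below shows that idea in plain Python; it is not the supervisor's code, and read_from is a hypothetical helper:

```python
import os
import tempfile


def read_from(path, pos):
    """Read from byte offset pos; a negative pos counts back from the end."""
    size = os.path.getsize(path)
    if pos < 0:
        # Clamp so that -N on a file shorter than N bytes starts at byte 0.
        start = max(size + pos, 0)
    else:
        start = min(pos, size)
    with open(path, "rb") as handle:
        handle.seek(start)
        return handle.read()


# Demonstration on a 5-byte temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
tail = read_from(tmp.name, -3)     # last three bytes
whole = read_from(tmp.name, -100)  # offset larger than the file
os.remove(tmp.name)
```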

==== CL 20113 ====
@FIX: qbjobs should be case-insensitive for user name; qbhosts should be case-insensitive for host names

JIRA: QUBE-3320, QUBE-3386

==== CL 20108 ====
@FIX: fix issue where Qube 7 Perl API module doesn't load on Linux, causing Perl-based jobs such as the Maya jobtype job to be unable to run.

Error messages:
Can't load '/usr/local/pfx/qube//api/perl/qb.so' for module qb: libQt5Core.so.5: cannot open shared object file: No such file or directory at /usr/lib64/perl5/DynaLoader.pm line 190.

Added /etc/ld.so.conf.d/qube-x86_64.conf file to the qube-core package, to add the path /usr/local/pfx/qube/lib/Qt to the runtime library search path

JIRA: QUBE-3403
ZD: 18920

==== CL 20102 ====
@FIX: 'DB ERROR: relation "joblog" does not exist' message in supelog

Fixed issue where ERROR messages like the following would show in the supelog:

QbDatabasePostgreSQL::_raw() DB ERROR: lastError=ERROR: relation "joblog" does not exist
LINE 1: SELECT id, jobid, subid, type, data, host FROM joblog WHERE...

JIRA: QUBE-3186
ZD: 18899

==== CL 20094 ====
@FIX: Qube API module for python 2.6 (_qb26.pyd) not loading

JIRA: QUBE-3387

==== CL 20081 ====
@FIX: On the Mac, the "pfx" user is created every time the supervisor package is installed or upgraded (add_pfx_account_osx.sh script).

 

##############################################################################

@RELEASE: 7.0-1c

##############################################################################

------------------------------------------------------------------

Added support for CentOS-7.4 and CentOS-7.5 in this release

------------------------------------------------------------------

==== CL 20071 ====
@FIX: agenda items failing due to timeout won't auto-retry

ZD: 18874

==== CL 20070 ====
@FIX: 7.0-1 supe installer won't work on platforms with python 2.6 (issue with init_supe_db.py script)

JIRA: QUBE-3379

==== CL 20028 ====
@FIX: querying supervisor config never returns database_port unless it's explicitly defined

JIRA: QUBE-3357

==== CL 20025 ====
@FIX: extremely inaccurate cumulative cpu time for agenda items

JIRA: QUBE-3375
ZD: 18841

 

##############################################################################
@RELEASE: 7.0-1a

##############################################################################

==== CL 20002 ====
@NEW: add new filtering options to Python API's qb.jobinfo():

  • submittedBefore
  • submittedAfter
  • updatedBefore
  • updatedAfter

Args specifying time may be in seconds since Unix epoch, or a datetime.date or datetime.datetime object.

Examples:

import qb
qb.jobinfo(updatedAfter=1530680480, updatedBefore=1530680700)

import datetime
weekago = datetime.date.today() - datetime.timedelta(days=7)
qb.jobinfo(submittedAfter=weekago)

JIRA: QUBE-3353

==== CL 19999 ====
@NEW: add new filtering options to qbjobs command: submittedBefore, submittedAfter, updatedBefore, updatedAfter. Option arg specifying time needs to be in seconds since Unix epoch.

JIRA: QUBE-3353
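
Because the qbjobs options take seconds since the Unix epoch, a small conversion helper can be handy when scripting around the command. A minimal sketch using only the Python standard library; to_epoch is a hypothetical helper, and naive date/datetime values are interpreted in local time:

```python
import datetime
import time


def to_epoch(value):
    """Convert a number, datetime.datetime, or datetime.date to epoch seconds."""
    # datetime.datetime is a subclass of datetime.date, so this branch
    # handles both; a plain date is treated as local midnight.
    if isinstance(value, datetime.date):
        return int(time.mktime(value.timetuple()))
    return int(value)


weekago = datetime.date.today() - datetime.timedelta(days=7)
weekago_epoch = to_epoch(weekago)
```

The resulting integer can then be supplied as the argument to one of the time-based options.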

 

##############################################################################
@RELEASE: 7.0-1

##############################################################################

This release includes key fixes to the supervisor, core and WV, among other fixes, and is
strongly recommended for any site currently using 7.0-0 or above.

==== CL 19967 ====

@FIX: include support for earlier versions of Perl. Now we support Perl 5.14 thru 5.26.

==== CL 19964 ====
@CHANGE: adjust (Windows) perl support to versions 5.14 thru 5.26

JIRA: QUBE-3368

==== CL 19957 ====
@FIX: update Perl API's copy of IPC::Run module to properly support Perl 5.26

==== CL 19938 ====
@FIX: Python 2.6 module (_qb26.pyd) was not included. Python modules were named with a .dll extension, instead of the preferred .pyd

JIRA: QUBE-3358

==== CL 19937 ====
@FIX: Python API .py files missing and/or incorrectly located w/o proper directory structure under QBDIR/api/lib/python

JIRA: QUBE-3358

==== CL 19935 ====
@CHANGE: ignore database_port if it is set to 3300 or 3306, and revert to the default, 50055.

JIRA: QUBE-3359
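
In effect, the port resolution behaves like the sketch below. The function name is hypothetical, and treating an unset value as "use the default" is an assumption; only the ignored values and the default of 50055 come from the note above:

```python
DEFAULT_DATABASE_PORT = 50055
IGNORED_PORTS = (3300, 3306)  # values listed in the note above


def effective_database_port(configured_port=None):
    """Return the port to use, given the database_port value from qb.conf."""
    if configured_port is None or configured_port in IGNORED_PORTS:
        return DEFAULT_DATABASE_PORT
    return configured_port
```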

==== CL 19934 ====
@FIX: init_supe_db.py: add version-aware execution of qubedb_xxxx.sql files

JIRA: QUBE-3336

==== CL 19933 ====
@FIX: add important missing INDEXes to the qube.subjob DB table for performance boost.

==== CL 19931 ====
@FIX: fix relative movie paths in images_to_move.py

==== CL 19915 ====
@CHANGE: remove direct SQL access from job cleanup script
@CHANGE: find_corrupt_jobs.py removed from the Qube installation; it checked for the existence of tables that are no longer in the Qube 7 schema

==== CL 19897 ====
@FIX: auto_remove worker flag missing from worker config dialogs

==== CL 19888 ====
@FIX: issue where Windows password couldn't be updated.

==== CL 19849 ====
@CHANGE: add the postgres-based database checks back into the supervisor installers

==== CL 19836 ====
@NEW: convert database_checks cmdline utility from MySQL to PostgreSQL

 

 

 

 

 

##############################################################################
@RELEASE: 7.0-0a

##############################################################################

==== CL 19831 ====
@FIX: jobs that reserve a global resource never run

Bug introduced at 7.0-0, where jobs specifying any global resource reservation would be stuck indefinitely in "pending" state.

JIRA: QUBE-3328

 

 

##############################################################################

@RELEASE: 7.0-0

##############################################################################

==== CL 19673 ====
@CHANGE: allow Metered Licensing (ML) with just a valid, unexpired supe license, and no worker license

JIRA: QUBE-2823

==== CL 19637 ====
@FIX: Perl API: added proper qb::version() support

==== CL 19636 ====
@NEW: add support for Perl 5.18, 5.20, 5.22, 5.24, and 5.26 on Windows.

JIRA: QUBE-749

==== CL 19529 ====
@NEW: add paexec.exe and ntrights.exe to aid with proper installation of postgresql server

==== CL 19507 ====
@NEW: add com.pipelinefx.postgresql.plist file to enable launchd support of PostgreSQL DB server on macOS

JIRA: QUBE-3100

==== CL 19504 ====
@NEW: add_pfx_account_osx.sh script that creates the "pfx" account that runs the PostgreSQL DB server.

The script is run from the "postinstall" process of the pkg installer.

JIRA: QUBE-3100

==== CL 19478 ====
@FIX: workers are always "auto-remove"d, even if "auto_remove" is not set in worker_flags.

ZD: 18512
JIRA: QUBE-3174

==== CL 19475 ====
@FIX: issue where instances would be stuck in "QB_PREEMPT_MODE_FAIL", causing the supervisor to tell instances to "wait and retry later" in response to retryWork() indefinitely.

Issue was caused when the preemptJobNetwork() routine determines that the
instance has started but has NOT yet started working on an agenda item, in
which case it would mark the QB_PREEMPT_MODE_FAIL in order to interrupt
(i.e. aggressively preempt) the instance; however, the interrupt was not
being triggered properly.

Issue was apparently introduced in CL19126.

==== CL 19462 ====
@FIX: issue where some daemon (supe/worker) threads exit early, after processing fewer client requests than specified via max_clients (e.g. 65, not 256).

Early exits should now only happen when "max threads" happened earlier.

==== CL 19457 ====
@TWEAK: add a couple of useful supelog lines to print in assignjob(), regarding result of calling converseWorker() for dispatch

==== CL 19454 ====
@FIX: add call to sendHostReport() so that a statusHost message is sent to the supe when the worker "received kill order for unassigned job". This should eliminate some of the jobs that stay in "dying" (or allow "kill" of jobs that are stuck in "dying")

==== CL 19443 ====
@NEW: add KeyShot commandline render script for batch rendering

==== CL 19437 ====
@FIX: workid is not duplicated by QbHistory copy constructor

==== CL 19436 ====
@FIX: "down" workers not always detected properly

JIRA: QUBE-3155
ZD: 18425

==== CL 19425 ====
@FIX: issue when supe thread doesn't hear back from worker during a dispatch. Related to CL19243.

Also fixed an issue (probably harmless) where an extra call to queue.releaseJob() was sometimes made in the findSubjobAndReserveJob() method.

==== CL 19415 ====
@CHANGE: add qbsub support for jobtypes "pyCmdrange" and "pyCmdline".

==== CL 19263 ====
@FIX: log directories for jobs submitted after the utility has been started but before the orphaned log removal is begun are erroneously removed

==== CL 19258 ====
@FIX: not running --use-frm when first-pass repair fails when message has different line-endings than OS X

==== CL 19243 ====
@FIX: add code to avoid mixed-up job instance status when worker-supervisor communications are dropped during job dispatch on an intermittently unreliable network

It was found that network hiccups can cause a worker to not respond to the
supervisor during the dispatch of a job instance, but still start running
the instance anyway. The worker would send the "running" instance report to
the supervisor, which is processed by a separate thread, which updates the
DB, causing a status mix-up.

Added code to detect such situations, and allowed the system to let the job
run (instead of force-removing it from the duty table) on the worker in
question.

Also added error-checking code on the worker side-- if worker detects that
it couldn't respond to the supe for a dispatch order, it will give up on
that job and release resources that it had just reserved for it.

ZD: 17868

==== CL 19236 ====
@FIX: jobs submitted by non-admin user without a specified priority attempt to submit at priority -1

JIRA: QUBE-3015

==== CL 19209 ====
@FIX: "down" workers would not be detected properly by the supervisor even when the supervisor_heartbeat_timeout expired.

ZD: 18057
JIRA: QUBE-3018

==== CL 19178 ====
@FIX: timing issue causing workers to get stuck with job instances.

Issue was seen on a very busy farm with intermittent drops in network
communications, when many supe threads would try to dispatch a single
instance at the same time.

ZD: 17868

==== CL 19164 ====
@CHANGE: On Unix, by default, supe uses a Unix domain socket to connect to the PostgreSQL server, unless the "database_host" parameter is set.

The default value of database_host is "" on Unix (Linux/macOSX), and "localhost" on Windows.
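
The defaults described above can be summarised in a few lines of Python. This is an illustrative sketch, not supervisor code; both function names and the "unix-socket"/"tcp" labels are hypothetical:

```python
import sys


def default_database_host(platform=sys.platform):
    """Default database_host: "localhost" on Windows, "" elsewhere."""
    return "localhost" if platform.startswith("win") else ""


def connection_mode(database_host):
    """An empty database_host means a Unix domain socket; otherwise TCP."""
    return "unix-socket" if database_host == "" else "tcp"
```

Setting database_host explicitly on Linux or macOS therefore switches the supervisor from the domain socket to a TCP connection.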

==== CL 19163 ====
@FIX: fix an issue where a worker can sometimes get stuck with a job instance that it's not running any longer

* Issue was seen when job instances are migrated and there are intermittent
networking issues between the supe and worker causing job updates to NOT
come through in an expected, orderly fashion.

ZD: 17868

==== CL 19126 ====
@FIX: on a network with intermittent worker-supe communication issues, bad timing can cause job instances to get stuck in "running" state

* In a bunch of routines that handle job-command executions (i.e., migrate,
kill, etc.) in QbSupervisorCommand, add code to do one last check when a
worker is unreachable, to see if the instance still belongs to the worker
before updating the instance on DB. It was found that, since a thread
dealing with down workers can spend quite a long time, sometimes
instances that a worker was processing can be moved off of it and the DB
updated by another thread (for example, assigned and running on another
worker)-- the check is designed to prevent our thread from overwriting
such updates.

ZD: 17868
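
The "one last check" described above is a form of conditional update: write the new status only if the instance is still assigned to the worker this thread was dealing with. The sketch below illustrates the idea with Python's sqlite3 module standing in for PostgreSQL; the table, column, and function names are hypothetical and do not reflect the Qube schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instance (id INTEGER PRIMARY KEY, worker TEXT, status TEXT)"
)
conn.execute("INSERT INTO instance VALUES (1, 'worker-b', 'running')")


def mark_if_still_owned(conn, instance_id, worker, new_status):
    """Update the instance only if it is still assigned to `worker`."""
    cursor = conn.execute(
        "UPDATE instance SET status = ? WHERE id = ? AND worker = ?",
        (new_status, instance_id, worker),
    )
    conn.commit()
    return cursor.rowcount == 1


def current_status(conn, instance_id):
    row = conn.execute(
        "SELECT status FROM instance WHERE id = ?", (instance_id,)
    ).fetchone()
    return row[0]


# Another thread already moved instance 1 to worker-b, so a late update
# on behalf of the unreachable worker-a must not overwrite it.
stale_applied = mark_if_still_owned(conn, 1, "worker-a", "pending")
status_after_stale = current_status(conn, 1)
```

Folding the ownership test into the UPDATE itself keeps the check and the write in a single statement, so no other thread can change the assignment between them.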

==== CL 19121 ====

@FIX: job instances can get into an odd state when dispatch routine doesn't hear back from the worker ("found dead").

Networking hiccups can cause this communication drop, which in turn may
cause job instances to be "stuck" in the running state on a worker, and be
unkillable.

ZD: 17868

==== CL 19118 ====
@FIX: Systemctl unit files for worker and supervisor not installed into correct location

==== CL 19109 ====
@FIX: optimize job cleanup script
@CHANGE: only scan log directories if log removal necessary
@CHANGE: removal of large number of orphaned log directories does not require skipping sanity checks

==== CL 18985 ====
@FIX: 'No database selected' MySQL error when removing ghost jobs
ZD: 17882

==== CL 18911 ====
@TWEAK: add workerlog to show the host's available properties when inspecting a newly dispatched job (when "checking job requirements").

==== CL 18910 ====
@INTERNAL FIX: supervisor patches to help cut down on the number of threads, and reduce chances of repeated worker rejections on some farms due to race-conditions/timing issues.

ZD: 17713

==== CL 18831 ====
@NEW: add support for retrieval of only a specified range of jobs (IDs, date, N most recent, etc) in the qbjobinfo() API

Changed the "sign" field of QbFilter class to be a QbString rather than a char, to support SQL operators that are longer than a single character, such as ">=", "<>" or "!=".

Added "limit" and "order_by" fields to the QbQuery class, so that any query can limit the number of jobs returned, and specify the sort order.

Made changes to db-support code (QbDatabase.cpp) and supervisor code (QbSupervisorQuery.cpp and Queue) to take advantage of the above changes and implement the desired range-specific queries.

JIRA: QUBE-2658
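
To show why a string-valued "sign" plus "limit" and "order_by" fields are enough for range queries, here is a rough Python analogue of turning such filters into SQL. It is not the C++ implementation; the table name, column names, and function name are hypothetical:

```python
ALLOWED_SIGNS = ("=", "<", ">", "<=", ">=", "<>", "!=")
ALLOWED_COLUMNS = ("id", "status")  # hypothetical column names


def build_job_query(filters, order_by=None, limit=None):
    """Build (sql, params) from a list of (column, sign, value) filters."""
    clauses, params = [], []
    for column, sign, value in filters:
        if column not in ALLOWED_COLUMNS or sign not in ALLOWED_SIGNS:
            raise ValueError("unsupported filter: %r %r" % (column, sign))
        # Values are passed separately as parameters, never inlined.
        clauses.append("%s %s %%s" % (column, sign))
        params.append(value)
    sql = "SELECT * FROM job"  # hypothetical table name
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    if order_by is not None:
        if order_by not in ALLOWED_COLUMNS:
            raise ValueError("unsupported order_by: %r" % order_by)
        sql += " ORDER BY " + order_by
    if limit is not None:
        sql += " LIMIT %d" % int(limit)
    return sql, params


sql, params = build_job_query(
    [("id", ">=", 100), ("id", "<>", 150)], order_by="id", limit=10
)
```

A single-character sign could express "<" or ">" but not ">=" or "<>", which is the limitation the change to a string removes.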

==== CL 18822 ====
@FIX: a bug in the startHost() dispatch routine causing the supervisor NOT to always dispatch jobs to workers when they became available.

@INTERNAL: QbServer::printMemUsage() modified to only kick in if QB_DEBUG_SERVER_MEM_USAGE is defined

ZD: 17713

==== CL 18802 ====
@FIX: correct 'restrictions' variable name and 'Restrictions' label

==== CL 18717 ====
@FIX: Job instances can become unkill-able with QB_PREEMPT_MODE_FAIL internal status

JIRA: QUBE-2819

==== CL 18680 ====
@FIX: supervisor rpm uninstall leaves the mysql/mariadb service in a stopped state instead of restarting it
