##############################################################################
#
# Qube Release Notes
#
##############################################################################
##############################################################################
@RELEASE: 7.5-0
==== CL 22602 ====
@NEW: add one more file for initialization of the central preferences database, qubedb_prep_0054.sql, which creates the "prefs" DB user.
JIRA: QUBE-3795
==== CL 22597 ====
@CHANGE: implemented code to make changes to postgresql.conf and pg_hba.conf files on installation, needed to support the new central preferences feature.
JIRA: QUBE-3795
==== CL 22596 ====
@CHANGE: "pfx" account's default password changed to a longer one (for new installations, except on Linux)
JIRA: QUBE-3667
==== CL 22593 ====
@NEW: add SQL to initialize the central preferences database.
JIRA: QUBE-3795
==== CL 22592 ====
@FIX: init_supe_db.py: make all calls to the "psql" command using the DB owner
==== CL 22458 ====
@CHANGE: point PYTHONPATH to $QBDIR/lib/python3.8 before the supervisor service is started, for its embedded python interpreter (Linux, macOS)
==== CL 22288 ====
@NEW: add "disable_central_prefs" flag to supervisor_flags
JIRA: QUBE-3778
==== CL 22230 ====
@FIX: a bunch more fixes to make python-based backends work properly with python3 while maintaining python2 compatibility.
Now if a python-based jobtype's job.conf specifies "execute_binding = Python" or "execute_binding = Python3", python3 will be used. If "execute_binding = Python2", python2 is used.
JIRA: QUBE-3747
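The interpreter selection described above boils down to the following mapping (a minimal Python sketch for illustration only, not the actual backend code; pick_interpreter is a hypothetical helper):

```python
def pick_interpreter(execute_binding):
    """Map a job.conf execute_binding value to an interpreter name.

    Per the note above: "Python" or "Python3" selects python3,
    while "Python2" selects python2.
    """
    binding = execute_binding.strip().lower()
    if binding in ("python", "python3"):
        return "python3"
    if binding == "python2":
        return "python2"
    raise ValueError("unknown execute_binding: %s" % execute_binding)
```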
==== CL 22190 ====
@CHANGE: Switch supervisor's embedded Python interpreter to python3.8.
JIRA: QUBE-2762, QUBE-3749
==== CL 22179 ====
@CHANGE: made Linux installations (RPM and DEB for CentOS/RHEL and Ubuntu, respectively) require "python3"
JIRA: QUBE-3767
==== CL 22177 ====
@CHANGE: convert python-based jobtypes (appFinder, pyCmdline, pyCmdrange) in qube/types/ from python2 to python3
JIRA: QUBE-3747
==== CL 22176 ====
@CHANGE: convert example python scripts in qube/examples/python from python2 to python3
JIRA: QUBE-3746
==== CL 22175 ====
@CHANGE: convert python scripts in qube/scripts from python2 to python3
JIRA: QUBE-3745
==== CL 22174 ====
@CHANGE: convert python scripts in qube/utils from python2 to python3
JIRA: QUBE-3745
==== CL 22081 ====
@FIX: "pfx" user is now created without a home directory in the install_supervisor script, which is used to do some initialization on DEB-based Linux platforms (i.e. Ubuntu).
It was previously set up to create a home dir, causing the DEB qube-supe installation to exit prematurely when the root user doesn't have write permissions to create "/home/pfx" (e.g. NFS-mounted /home).
Now the "useradd" command points to "/var/tmp" as the pfx user's home dir.
==== CL 22080 ====
@NEW: add perl 5.30 support (for Ubuntu 20.04 LTS)
JIRA: QUBE-3721
==== CL 22077 ====
@FIX: "pfx" user is now created without a home directory in the install_supervisor script, which is used to do some initialization on RPM-based Linux platforms.
It was previously set up to create a home dir, causing the RPM qube-supe installation to exit prematurely when the root user doesn't have write permissions to create "/home/pfx" (e.g. NFS-mounted /home).
Now the "useradd" command points to "/usr/tmp" as the pfx user's home dir.
==== CL 22057 ====
@NEW: Ubuntu 18.04 and 20.04 support
JIRA: QUBE-3720, QUBE-3721
==== CL 22039 ====
@NEW: Add Python 3.7 (standard) and 3.8 (homebrew) API support on macOS
==== CL 22035 ====
@NEW: Add Python 3.6, 3.7, and 3.8 API support for Windows.
JIRA: QUBE-2762
==== CL 22030 ====
@NEW: Add Python 3.6, 3.7, and 3.8 Qube API support on Linux.
==== CL 21802 ====
@CHANGE: macOS to build Qube core with Qt 5.14.2
JIRA: QUBE-3688
==== CL 21801 ====
@CHANGE: remove python 2.6 support from all platforms
JIRA: QUBE-3691
==== CL 21800 ====
@CHANGE: Linux to build with Qt 5.14.2
JIRA: QUBE-3688
==== CL 21769 ====
@NEW: add Python 3.8 compatibility to the main Qube Python API, including its supporting .py scripts.
JIRA: QUBE-2762
==== CL 21731 ====
@FIX: Fix crash when no options are given. A usage message is now printed when no args are present.
@CHANGE: Made the "checks" arguments instead of options.
@INTERNAL: refactored a bunch of stuff.
JIRA: QUBE-3206
==== CL 21644 ====
@FIX: child processes of job instances sometimes do not all die properly when the parent thread dies
ZD: 20225
==== CL 21641 ====
@FIX: child processes of job instances sometimes do not all die properly when the parent thread dies
ZD: 20225
==== CL 21556 ====
@FIX: fixed a problem where a job can get stuck in the "dying" state due to a timing-related issue.
This was causing, among other things, global resources to not be released properly.
ZD: 20307
==== CL 21432 ====
@CHANGE: install_supervisor script: install the Data W/H DB by calling $QBDIR/datawh/install_datawarehouse_db.sh
==== CL 21385 ====
@TWEAK: Don't give up on the first error in enableRequiredPrivileges(); instead, try enabling all privileges. Also print the number of errors.
==== CL 21360 ====
@FIX: Aggressively preempted frames can get missed and left in "pending" while instances all finish
ZD: 20177
==== CL 21351 ====
@CHANGE: add SE_DEBUG_NAME to the list of privileges to be enabled; also print more info to the workerlog
* add SE_DEBUG_NAME to the list of privileges to be enabled
* print a WARNING, with the reason, when OpenProcess() fails in cleanup()
* add the instance name to more output lines when available/applicable
==== CL 21242 ====
@FIX: On host reboot, the supervisor needs to start after postgresql is started and ready
* added code to check the DB connection at supervisor boot time, and retry every 10 seconds, up to 6 attempts (1 minute), effectively delaying the supervisor boot until after the DB is ready.
JIRA: QUBE-3637
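The boot-time check described above amounts to a bounded retry loop; a minimal Python sketch for illustration (the real check lives in the supervisor's startup code; connect_to_db is a hypothetical placeholder):

```python
import time

def wait_for_db(connect_to_db, attempts=6, delay=10):
    """Try the DB connection up to `attempts` times, `delay` seconds apart.

    With the defaults (6 attempts, 10 s apart) this delays startup
    by at most about a minute, matching the note above.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect_to_db()
        except ConnectionError:
            if attempt == attempts:
                raise  # DB never came up; give up after ~1 minute
            time.sleep(delay)
```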
==== CL 21239 ====
@NEW: add a way to tell the qbjobinfo() API routine to only query for and pull selective job data (aka "columns" or "fields").
Developers using the Qube C++ and/or Python API can now tell the qbjobinfo() routine (qb.jobinfo() for Python) to only query for and pull selective job data (aka "columns" or "fields"), for leaner, meaner, more economical queries.
* Add support for explicitly specifying needed fields in the C++ API's qbjobinfo().
* Add support for explicitly specifying needed fields in the Python API's qb.jobinfo(), a la 6.10's direct query API.
* Also add a "-fields" option to qbjobs
* qbjobs now makes leaner queries by default (unless an option to display details is specified, like "-long" or "-notes")
[Examples]
C++:
QbString query_fields_str = "id,username,status";
QbStringList query_fields;
QbExpression::split(query_fields_str, query_fields);
QbQuery query;
for(int i = 0; i < query_fields.length(); i++) {
    QbField *f = new QbField(*query_fields.get(i));
    query.fields().push(f);
}
QbJobList jobs;
qbjobinfo(query, jobs);
Python:
jobs = qb.jobinfo(fields = ['id','username','status'])
JIRA: QUBE-3623
ZD: 19955
==== CL 21234 ====
@FIX: Add a timeout for agenda-based jobs stuck in "running" status, in a "waiting" loop.
TL;DR:
Sometimes, agenda-based job instances can get stuck in the "running" state, in a "waiting" loop. A timeout, currently hardcoded to 60 seconds, has been added to force those jobs to break out of the loop.
Details:
Sometimes, agenda-based job instances can get stuck in a "waiting" loop, with messages like the following repeating indefinitely in the job's stdout:
[Dec 20, 2019 18:23:46] HOSTNAME[47572]: requesting work for: 424805.0
[Dec 20, 2019 18:23:46] HOSTNAME[47572]: got work: -1: - waiting
[Dec 20, 2019 18:23:46] HOSTNAME[47572]: INFO: informing worker[127.0.0.1] INFO: told to wait & retry from supe-- sleeping for [7] seconds
A job instance stuck in this state can tie up a worker's job slot(s) until it is manually intervened with (killed, migrated, etc), or until it hits its "subjob timeout" (assuming the job was set up with one).
This issue, newly introduced in 7.x, has been found to happen due to race conditions.
It is particularly likely to occur when the following conditions are met:
* jobs have the migrate_on_frame_retry job flag set AND they use retrywork/retrysubjob
* job instances fail quickly (i.e. the job process/renderer crashes and exits quickly)
* there are idle workers
(There are other scenarios in which this can also happen, such as when aggressive preemption is done rapidly, but there are normally not many idle workers when preemptions do happen, so it's less likely.)
In a nutshell:
* instance fails on a worker
* supe detects the failure, migrates and starts the instance on a new worker
* the new worker reports the instance now "running"
* the first worker finishes cleaning up and reports that the instance is now "pending"
* instance gets stuck in a "waiting" loop on the new worker.
A timeout, currently hardcoded to 60 seconds, has been added to force those jobs to break out of the infinite loop.
ZD: 19977, 20094, 19967
JIRA: QUBE-3638
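The shape of the timeout fix can be sketched as a bounded polling loop (illustrative Python only, not the actual worker source; request_work is a hypothetical stand-in for the worker's "requesting work" call):

```python
import time

WAIT_TIMEOUT = 60  # seconds; currently hardcoded, per the note above

def poll_for_work(request_work, timeout=WAIT_TIMEOUT, sleep_secs=7):
    """Ask the supervisor for work until some arrives or `timeout` elapses.

    Returns the work item, or None if the timeout forces the instance
    to break out of what would otherwise be an infinite "waiting" loop.
    """
    deadline = time.monotonic() + timeout
    while True:
        work = request_work()
        if work is not None:
            return work
        if time.monotonic() >= deadline:
            return None  # timed out: stop waiting
        time.sleep(sleep_secs)
```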
==== CL 21217 ====
@FIX: Add a timeout for agenda-based jobs stuck in "running" status, in a "waiting" loop.
TL;DR:
Sometimes, agenda-based job instances can get stuck in the "running" state, in a "waiting" loop. A timeout, currently hardcoded to 60 seconds, has been added to force those jobs to break out of the loop.
Details:
Sometimes, agenda-based job instances can get stuck in a "waiting" loop, with messages like the following repeating indefinitely in the job's stdout:
[Dec 20, 2019 18:23:46] HOSTNAME[47572]: requesting work for: 424805.0
[Dec 20, 2019 18:23:46] HOSTNAME[47572]: got work: -1: - waiting
[Dec 20, 2019 18:23:46] HOSTNAME[47572]: INFO: informing worker[127.0.0.1] INFO: told to wait & retry from supe-- sleeping for [7] seconds
A job instance stuck in this state can tie up a worker's job slot(s) until it is manually intervened with (killed, migrated, etc), or until it hits its "subjob timeout" (assuming the job was set up with one).
This issue, newly introduced in 7.x, has been found to happen due to race conditions.
It is particularly likely to occur when the following conditions are met:
* jobs have the migrate_on_frame_retry job flag set AND they use retrywork/retrysubjob
* job instances fail quickly (i.e. the job process/renderer crashes and exits quickly)
* there are idle workers
(There are other scenarios in which this can also happen, such as when aggressive preemption is done rapidly, but there are normally not many idle workers when preemptions do happen, so it's less likely.)
In a nutshell:
* instance fails on a worker
* supe detects the failure, migrates and starts the instance on a new worker
* the new worker reports the instance now "running"
* the first worker finishes cleaning up and reports that the instance is now "pending"
* instance gets stuck in a "waiting" loop on the new worker.
A timeout, currently hardcoded to 60 seconds, has been added to force those jobs to break out of the infinite loop.
ZD: 19977, 20094, 19967
JIRA: QUBE-3638
==== CL 21060 ====
@FIX: bug where jobs passively preempted while working on the final agenda item don't complete properly, but go "pending" with the agenda items 100% done
JIRA: QUBE-3626
ZD: 19967
Also:
@TWEAK: made the IP address also print, in addition to the hostname, when qbwrk.conf for a host is loaded (QbSupervisor::loadWrkConfig()).
Now prints something like:
loaded config for host: hostname (aaa.bbb.ccc.ddd)
==== CL 21059 ====
@INTERNAL CHANGE/FIX: Removed the special, undocumented "feature" where the "str" passed to "QbField::value(str)" may optionally be prefixed with a special character to specify an operator (or "sign") to be applied for the field.
The special char was one of [~,=,>,<,%,*], and was setting "sign" to a string representing the ASCII value of the op character ("itoa" value, for example '%' to "37").
Instead of just fixing that, we're ditching this special feature, as it is unnecessary and confusing. The operator can always be specified explicitly by calling QbField::sign(str). A scan of the qube code base didn't reveal any use of this feature, but it's possible that peripheral code (namely python apps, like WV or AV) may be using it (though I highly doubt it).
==== CL 20817 ====
@FIX: fix_mysql_column_orders.sql: add back AUTO_INCREMENT to columns ('id' or 'uq') of the following MySQL tables: globalcallback, globalevent, jobid, lostevent
The ALTER TABLE statements used to modify column orders on these tables were wiping the AUTO_INCREMENT of these tables' specified columns. This was in turn causing issues (job submissions failing, for example) when downgrading 7.0 to 6.10.
ZD: 19833
==== CL 20683 ====
@FIX: qubeproxy on Linux does not use the same standard default password as it does on macOS/Windows
JIRA: QUBE-613
==== CL 20681 ====
@NEW: Add ability to lock/unlock workers by MAC address
JIRA: QUBE-243
==== CL 20658 ====
@CHANGE: Use FK (foreign keys) in Postgres for job removal.
* Optimized job removal.
* Added utils/pgsql/qubedb_0054.sql, which will "ALTER TABLE" (via init_supe_db.py) the relevant job-related tables.
JIRA: QUBE-3319
==== CL 20650 ====
@FIX: Setting a *_flags param to an empty string should mean "no flags", not "default flags"
Specifically, supervisor_job_flags would not behave as expected, and would take on the default values when specified as:
supervisor_job_flags = ""
or
supervisor_job_flags =
JIRA: QUBE-620
==== CL 22713 ====
@CHANGE: Disabling "query" in supervisor_verbosity by default, to avoid overcrowding the supelog with the frequent (2 or more per second) "job/host query received" messages generated by the new supervisorProxy queries
##############################################################################
#
# The Qube! Supervisor Proxy Release Notes
#
##############################################################################
The Qube! Supervisor Proxy:
Watches for changes to jobs and workers on the Qube! supervisor and pushes those changes out to the Qube! UI clients running on the network. This reduces load on the supervisor in large farms, and the UIs get iterative updates for the jobs and workers panels.
##############################################################################
@RELEASE: 7.5-0
##############################################################################
What's new:
The Qube! Supervisor Proxy is now available on all Qube! supported platforms.