    ###############################################################################

    @RELEASE: 6.9-2a

    ##############################################################################

    @SUMMARY: 6.9-2a is a patch release of 6.9-2, and includes the following fixes.

    ==== CL 18717 ====
    @FIX: Job instances can become unkill-able with QB_PREEMPT_MODE_FAIL internal status

    JIRA: QUBE-2819

    ==== CL 18351 ====
    @CHANGE: background helper thread improvements

    * limit the number of workers that are potentially recontacted by the background helper routine to 50 per iteration.

    * background thread exits and refreshes after running for approximately 1 hour, as opposed to 24 hours

    ZD: 17124

    ==== CL 18340 ====
    @FIX: allow special characters in job name field at submissions

    JIRA: QUBE-2748

    ==== CL 18324 ====
    @CHANGE: output of "qbadmin s -config" and "qbadmin w -config hostname" now sorted alphabetically.

    JIRA: QUBE-2654

    ==== CL 18285 ====
    @FIX: add better error-checks in cmdrange jobtype's log-parsing code, in case the log file is not readable.

    In some situations, fseek() was causing crashes in the parseFileStream() routine.

    ZD: 17442

    ==== CL 18221 ====
    @FIX: prevent "host.processors" from being unset when jobs are modified.

    JIRA: QUBE-2649

    ==== CL 18157 ====
    @FIX: shortened the "qbreportwork" timeout from 600 seconds to 20 when it reports a "failed" work on a job that has migrate_on_frame_retry set.

    This was causing long 10-minute pauses on the job instance when a frame
    fails after exhausting all of its retry counts.

    Original change was made in CL17206, for QUBE-2202/ZD16553.

    ZD: 17447

    ==== CL 18147 ====
    @FIX: Windows worker wouldn't properly release automounted drives at the end of running a job instance

    ZD: 17400

    ==== CL 18001 ====
    @FIX: Python API's qb.ping(asDict=True) was broken when metered licensing was unauthorized, because of the minus sign

    ==== CL 17889 ====
    @CHANGE: job queries requesting subjob and/or work details must now explicitly provide job IDs.

    Both the qbjobinfo() C++ and qb.jobinfo() Python APIs now reject such queries when no job ID is given, and return an error.

    For example, the Python call "qb.jobinfo(subjobs=True)" will raise a runtime exception. It must now be called as "qb.jobinfo(subjobs=True, id=12345)" or "qb.jobinfo(subjobs=True, id=[1234,5678])".
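
    The rule above can be mirrored by a small client-side guard that runs before a query is issued. This is only a sketch: the helper name check_jobinfo_args is hypothetical and not part of the Qube API, the "work" keyword is an assumed name for the work-details flag, and only "subjobs" and "id" are taken from the note above.

```python
def check_jobinfo_args(subjobs=False, work=False, id=None):
    """Return keyword arguments for a job query, or raise if a detail
    query lacks explicit job IDs.

    Hypothetical helper; it only mirrors the rule described in CL 17889.
    The 'work' keyword name is an assumption.
    """
    wants_details = subjobs or work
    # Normalize a single ID to a list; None means no IDs were given.
    if id is None:
        ids = []
    elif isinstance(id, (list, tuple)):
        ids = list(id)
    else:
        ids = [id]
    if wants_details and not ids:
        raise ValueError("subjob/work detail queries must provide job IDs")
    kwargs = {"subjobs": subjobs, "work": work}
    if ids:
        kwargs["id"] = ids
    return kwargs
```

    The returned dictionary could then be passed on to the real query call, so a missing ID is caught locally rather than by the supervisor.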

    JIRA: QUBE-244

    ==== CL 17863 ====
    @FIX: Qube language callback command "mail-status" wasn't working properly, setting the smtp "TO" field to an incorrect string.

    ==== CL 17858 ====
    @FIX: qb.deleteworkerproperties() and qb.deleteworkerresources() fn should return an error when used with the wrong 2nd arg (must be a list)

    ZD: 16932
    JIRA: QUBE-2381

    ==== CL 17856 ====
    @FIX: misleading "invalid key" error message in supelog when supervisor_max_metered_licenses is set to 0

    JIRA: QUBE-2397

    ==== CL 17797 ====
    @FIX: ignore any ethernet interface with "virtual" in its description when detecting the primary MAC address on Windows.

    ZD: 17072

    ==== CL 17790 ====
    @FIX: issue where the background helper thread frequently sends 2 or more update requests (QB_MESSAGE_REQUEST_UPDATE) at once to a single "questionable" worker (i.e., one that has missed enough heartbeats and is potentially down).

    ZD: 17124

    ==== CL 17735 ====
    @FIX: badlogin jobs can't be retried or killed (previously fixed in CL15011, but regressed)

    JIRA: QUBE-642
    ZD: 12699, 17010

    ==== CL 16491 ====
    @NOTES: Add support for AfterEffects point release scheme (2015.3)

     

    ##############################################################################
    @RELEASE: 6.9-2

    ##############################################################################

     

    @SUMMARY: This is a maintenance release of 6.9, and includes a few fixes
    and improvements to 6.9-1. Recommended upgrade for all 6.9 customers.

     

    ##############################################################################

    ==== CL 17763 ====
    Supervisor and worker now use correct startup scripts for CentOS 7+.

    ==== CL 17735 ====
    @FIX: badlogin jobs can't be retried or killed (previously fixed in CL15011, but regressed)

    JIRA: QUBE-642
    ZD: 12699, 17010

     

     

    ##############################################################################

    @RELEASE: 6.9-1

    ##############################################################################

     

    @SUMMARY: This is a maintenance release of 6.9, and includes a number of fixes
    and improvements to 6.9-0. Recommended upgrade for all 6.9 customers.

     

    ##############################################################################


    ==== CL 17696 ====
    @UPDATE: add explanation for "deferTableCreation" to the python qb.submit() API routine.

    JIRA: QUBE-2400

    ==== CL 17692 ====
    @FIX: another memory leak plugged in the startHost()-related routine, startQualifiedJobsOnHost(). This was causing successful iterations of startHost() (i.e., an instance was dispatched to a worker) to bloat memory. Among other places, it was affecting the background helper thread (when it does the "requeuing host" routine).

    JIRA: QUBE-2382

    ==== CL 17649 ====
    @FIX: memory leak in preemption code, especially when preemption policy is set to passive or is disabled by the algorithm.

    JIRA: QUBE-2382

    ==== CL 17634 ====
    @FIX: memory leak in one of the host-triggered dispatch routines
    startQualifiedJobsOnHost(), which is called from startHost().

    Among other things, this was bloating the memory usage inside the helper
    routine running in a background thread/process (cleanermain()).

    JIRA: QUBE-2382
    ZD: 16952

    ==== CL 17610 ====
    @FIX: memory corruption that would cause python or perl to crash when the function was called inside jobs.

    JIRA: QUBE-2389

    ==== CL 17595 ====
    @FIX: fixed memory leak in QbPack::store() and storeXML() methods, which were causing, among other things, supervisor threads to bloat when processing large job submissions

    JIRA: QUBE-2382

    ==== CL 17594 ====
    @FIX: plugged a potential memory leak in QbDaemon communication code, affecting all server (supervisor, worker) programs

    JIRA: QUBE-2382

    ==== CL 17593 ====
    @FIX: plugged memory leak in dispatch code

    JIRA: QUBE-2382

    ==== CL 17592 ====
    @FIX: plugged potential memory leak in user permission-check routine, specifically in the group-access check code

    JIRA: QUBE-2382

    ==== CL 17566 ====
    @NEW: qbwrk.conf loading optimization (and thus "qbadmin w -reconfig" speed up) by explicitly listing template names and non-existing hostnames in the new [global_config] section

    * added [global_config] section to the qbwrk.conf file, and allow new config parameters "templates" to list all qbwrk.conf template section names, and "non_existent" to list all non-existent hostnames

    * supe skips ip-address resolution for all section names included in "templates" and "non_existent", and all reserved names, i.e.: "global_config", "default", "linux", "osx", and "winnt", thus speeding up the loading of qbwrk.conf file, which in turn speeds up supervisor boot time and "qbadmin w -reconfig" operation.
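
    The skip rule described above can be sketched as a simple filter over section names. This is an illustration only, not the supervisor's code: the function name is hypothetical, and the case-insensitive comparison is an assumption.

```python
# Reserved section names listed in the note above.
RESERVED_SECTIONS = {"global_config", "default", "linux", "osx", "winnt"}


def sections_needing_resolution(section_names, templates, non_existent):
    """Return the section names that would still need IP-address resolution.

    Hypothetical helper. Names listed under "templates" or "non_existent",
    and all reserved names, are skipped. Comparison is case-insensitive,
    which is an assumption made for this sketch.
    """
    skip = set(RESERVED_SECTIONS)
    skip.update(name.lower() for name in templates)
    skip.update(name.lower() for name in non_existent)
    return [name for name in section_names if name.lower() not in skip]
```

    With section names "default", "BigNode", "hostA", and "oldhost", where "BigNode" is listed as a template and "oldhost" as non-existent, only "hostA" would be left for resolution.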

    JIRA: QUBE-2346

    ==== CL 17540 ====
    @CHANGE: removed unnecessary submit-time check/rejection of omithosts and omitgroups.

    ZD: 16907, 16908
    JIRA: QUBE-2366

    ==== CL 17449 ====
    @FIX: directory deletion during log cleanup can fail if the supervisor is updating the job history file at the same time

    ==== CL 17435 ====
    @FIX: supervisor process handling a qbping request should always reread the license file before replying

    There was a code path that instructs the supe thread to force-read the
    license file, but the read was not happening under certain conditions; the
    code was returning the old cached data if available, or the default count
    of 2 if the cache isn't available.

    * add a few more informational lines to print to the supelog at license
    re-reading.

    JIRA: QUBE-2317

    ==== CL 17422 ====
    @FIX: make formatting and object instantiation compatible with Python 2.6

    ==== CL 17416 ====
    @FIX: remove unnecessary error message in the schema upgrade routine

    JIRA: QUBE-2283

    ==== CL 17414 ====
    @CHANGE: Add more text to describe the subtle yet significant difference between "retry" and "requeue" Python API routines

    JIRA: QUBE-2049

    ==== CL 17403 ====
    @FIX: jobs with status "registering" appear when submissions are rejected due to incorrect requirements specifications

    ZD: 16408
    JIRA: QUBE-2034

    ==== CL 17402 ====
    @FIX: intermittent bug where some supe threads won't properly read the supervisor license key from qb.lic

    * add warning message to print to supelog when the license file reader
    returns zero-length data

    ZD: 16828
    JIRA: QUBE-2317

    ==== CL 17390 ====
    @FIX: post-flight should only be run when qbreportwork() is invoked with an agenda-item with terminal-state

    JIRA: QUBE-2032
    ZD: 16412

    ==== CL 17376 ====
    @FIX: Triggers incorrectly executing multiple times

    When a composite (i.e, using && or ||) trigger is specified for a job's callback, such as "done-job-job1 && done-job-job2",
    the callback would erroneously get run multiple times.

    ZD: 16282
    JIRA: QUBE-1881

    ==== CL 17369 ====
    @FIX: issue introduced in 6.9 where requestwork() jobtype backend routine will crash when frame padding is 40 or greater.

    Python jobtype backend, in particular, was found to crash during a call to
    the API routine qb.requestwork(), with a "*** stack smashing detected ***:"
    error message and a backtrace.

    ZD: 16759
    JIRA: QUBE-2318

    ==== CL 17290 ====
    @TWEAK: license-reading routine prints the total license count to the supelog

    JIRA: QUBE-2003

    ==== CL 17289 ====
    @TWEAK: "ping" handler to print out more info to supelog

    Every "qbping" will now print something like the following to the supelog:

    [Nov 18, 2016 16:25:55] shinyambp[11662]: INFO: responded to ping request from [127.0.0.1]: 6.9-0 bld-custom osx - - host - 0/11 unlimited licenses (metered=0/0) - mode=0 (0)

    JIRA: QUBE-2002

    ==== CL 17231 ====
    @FIX: disabled verbose option for logging libcurl actions

    ==== CL 17208 ====
    @CHANGE: Populate the subjob (instance) objects with more data (like status), and not just the IDs, when subjob info is requested via "qbhostinfo" (qb.hostinfo(subjobs=True) for python API)

    Previously, only jobid, subid, and host info (name, address, macaddress)
    were filled. Now, things like "status", "timestart", "allocations",
    etc. are properly filled in.

    JIRA: QUBE-2073
    ZD: 16541

    ==== CL 17206 ====
    @FIX: When "migrate_on_frame_retry" job flag is set, prevent backend from doing further processing (especially another requestwork()) after a work failed

    This was causing race conditions that left agenda items stuck in the
    "retrying" state while no instances were processing them.

    Now the reportwork() API routine is modified so that if it's invoked to
    report that a work "failed", and the "migrate_on_frame_retry" is set on the
    job, it will stop processing (does a long sleep), and let the worker/proxy
    do the process clean up.

    JIRA: QUBE-2202
    ZD: 16553

    ==== CL 17186 ====
    @FIX: "VirtualBox Host-Only Ethernet Adapter" is now ignored when daemons (supe, worker) try to pick a primary mac address

    JIRA: QUBE-2149
    ZD: 16561

    ==== CL 17182 ====
    @CHANGE: all classes that inherit from QbObject print as a regular dictionary, no longer have a __repr__ which prints the job data as a single flat string
    @NEW: add qb.validatejob() function to python API, help find malformed jobs that crash the user interfaces

    ==== CL 17141 ====
    @FIX: Any job submitted from within a running job picks up the pgrp of the submitting job

    By design, if the submission environment has QBGRPID and QBJOBID set, the
    API's submission routine will set the job's pgrp and pid, respectively to
    the values specified in the environment variables.

    One couldn't override this "inheritance" behavior even by explicitly
    specifying "pgrp" or "pid" in the job being submitted, for instance with
    the "-pgrp" command-line option of qbsub.

    Fixed, so that setting "pgrp" to 0 on submission means that the job should
    generate its own pgrp instead of inheriting it from the environment.

    JIRA: QUBE-2141
    ZD: 16545

    ==== CL 17101 ====
    @NEW: add "-dying" and "-registering" options to qbjobs.
    @CHANGE: also add dying and registering jobs to the "-active" filter.

    JIRA: QUBE-2091
    ZD: 16469

    ==== CL 17083 ====
    @FIX: Python API: qbping(asDict=True) crashes when used against older (pre-6.9) supe

    Among other things, this was causing WV to crash and AV to note an
    exception (but not crash) when starting up with an older supervisor.

    JIRA: QUBE-2084

     

    ##############################################################################

    @RELEASE: 6.9-0

    ##############################################################################

    ==== CL 16804 ====
    @TWEAK: added code to print what operation was requested, when printing out "permission granted to user..."

    ==== CL 16776 ====
    @FIX: Python API should handle exception for when gethostbyname() doesn't work in mysqlConnect

    JIRA: QUBE-1965

    ==== CL 16770 ====
    @CHANGE: Ensure that the pending reasons returned by qb.hostorder (or qbhostorder command) take metered licensing into account

    JIRA: QUBE-1986

    ==== CL 16696 ====
    @NEW: add supervisor_max_metered_licenses support to qb.conf, which enables site-admins to customize the effective limit of metered licenses that can be used at any given time.

    This number must be smaller than the metered account's limit, or it will be
    capped at the account limit.

    Setting this to 0 effectively disables metered licensing, while setting it
    to -1 (default) allows usage up to the metered account's limit.
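
    The three cases above (-1, 0, and a positive value) can be summarized in a short sketch. The function name is hypothetical, and treating every negative value like -1 is an assumption; the supervisor's actual handling of other negative values is not stated here.

```python
def effective_metered_limit(configured, account_limit):
    """Return the effective metered-license limit.

    Hypothetical helper mirroring the description of
    supervisor_max_metered_licenses:
      -1 (default) -> use the metered account's limit
       0           -> metered licensing effectively disabled
       N > 0       -> N, capped at the account's limit
    Any negative value is treated like -1 (an assumption).
    """
    if configured < 0:
        return account_limit
    return min(configured, account_limit)
```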

    JIRA: QUBE-1867

    ==== CL 16668 ====
    @NEW: made available some frame-padding related environment variables during the execution of job instances and pre/postflights:

    QB_FRAME_PADDING
    QB_PADDED_FRAME_NUMBER
    QB_PADDED_FRAME_START
    QB_PADDED_FRAME_END
    QB_PADDED_FRAME_STEP
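
    As a rough illustration of how a job script might use one of these variables, the sketch below zero-pads a frame number to the width given by QB_FRAME_PADDING. It assumes the variable holds a plain integer width; the exact value format is not stated in this note, and the helper name is hypothetical.

```python
import os


def padded_frame(frame, environ=None):
    """Zero-pad a frame number using QB_FRAME_PADDING.

    Hypothetical helper. Assumes QB_FRAME_PADDING is an integer width;
    a missing or empty variable means no padding.
    """
    environ = os.environ if environ is None else environ
    width = int(environ.get("QB_FRAME_PADDING", "0") or 0)
    # str.zfill pads with leading zeros and never truncates.
    return str(frame).zfill(width)
```

    Passing a dictionary instead of relying on os.environ keeps the helper easy to try outside of a running job instance.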

    JIRA: QUBE-1841

    ==== CL 16665 ====
    @CHANGE: All "subjob" sections in qbsummary output show "instance" in the title

    @CHANGE: renamed "*vs" options to "*vi" (such as "pvi" or "cvi"). For
    compatibility, the older names still work, just not advertised in the
    "help" output

    @FIX: const-ness of QbString::replacevalue() method

    JIRA: QUBE-1617

    ==== CL 16643 ====
    @FIX: added dependency on mysql-libs (or mariadb-libs) to the supervisor RPM

    JIRA: QUBE-1784

    ==== CL 16642 ====
    @CHANGE: automatic capping of priorities to supervisor_highest_user_priority

    If an ordinary (non-admin) user tries to submit jobs at a higher priority (i.e., lower numerical value) than supervisor_highest_user_priority, the jobs will be accepted, but with the priority automatically capped at supervisor_highest_user_priority (silently, except for a WARNING message in the supelog).
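
    Because a lower number means a higher priority, the cap amounts to taking the larger of the two values for non-admin users. A minimal sketch, with a hypothetical function name and an assumed "is_admin" flag:

```python
def effective_priority(requested, highest_user_priority, is_admin=False):
    """Return the priority a submitted job would end up with.

    Hypothetical helper. Lower numbers mean higher priority, so a
    non-admin request numerically below the cap is raised to the cap.
    """
    if is_admin:
        return requested
    return max(requested, highest_user_priority)
```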

    JIRA: QUBE-1804

    ==== CL 16629 ====
    @CHANGE: "kill work" on a running agenda item will now put the instance processing the agenda item back to "pending", instead of also killing it.

    JIRA: QUBE-627

    ==== CL 16628 ====
    @FIX: "qb_default_string()" warning printed during linux qube-core installation

    Corrected code so that warnings like the following won't print any more:

    WARNING: qb_default_string() unknown value[1001]
    WARNING: qb_default_string() unknown value[1002]

    JIRA: QUBE-1894

    ==== CL 16602 ====
    @FIX: misleading database name printed in error handler for MySQL stored procedures PFX_CALC_CPU_TIME() and PFX_CALC_AVG_WORK_TIME(); "ERROR: TABLE NOT FOUND IN DB pfx_dw.<actual_database_name>"

    ==== CL 16517 ====
    @FIX: C4D appFinder jobs don't apply path translation properly on Windows, backslashes are converted too early

    ==== CL 16407 ====
    @NEW: add SMTP Auth support over SSL and TLS connections.

    @CHANGE:

    * add new mail config qb.conf parameters: mail_user, mail_password, mail_connection_type

    * modified mail_port to be 0 by default, which means use the standard port depending on connection type: 25, 465 (SSL), or 587 (TLS)
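
    The default-port behavior above can be sketched as a lookup. The connection type strings used here ("plain", "ssl", "tls") are assumptions for illustration; the values actually accepted by mail_connection_type are not listed in this note.

```python
# Standard ports named in the note above; the dictionary keys are
# assumed names for the connection types.
STANDARD_MAIL_PORTS = {"plain": 25, "ssl": 465, "tls": 587}


def resolve_mail_port(mail_port, connection_type):
    """Return the port to use for outgoing mail.

    Hypothetical helper: a mail_port of 0 means "use the standard
    port for the connection type"; any other value is used as given.
    """
    if mail_port != 0:
        return mail_port
    return STANDARD_MAIL_PORTS[connection_type]
```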

    ==== CL 16389 ====
    @FIX: calls to qb.reportwork that happen very close together can cause the supervisor to deadlock on a single frame's status

    ==== CL 16379 ====
    @FIX: case-insensitive parsing of template names in qbwrk.conf when listed for template inheritance

    The following now works (hostA will be in the "big" group):

    [BigNode]
    worker_groups = "big"

    [hostA] : bignode


    JIRA: QUBE-1809

    ==== CL 16369 ====
    @FIX: don't mark the instance as failed if there is one more command to run, the child process has already exited, and the command is sys.exit(0); happens when maya is shut down with its native quit() function.

    ==== CL 16338 ====
    @CHANGE: database checks script splits logging levels between stdout and stderr

    ==== CL 16308 ====
    @CHANGE: fixed every reference to "subjob" to "instance"

    JIRA: QUBE-1768

    ==== CL 16303 ====
    @CHANGE: add supervisor mode settings (such as "disable_metered") to display in qbping output, and be returned in the qb.ping(asDict=True) Python API invocation

    JIRA: QUBE-1759

    ==== CL 16286 ====
    @FIX: checkDiskUsage fails when --mysql option is used and root can't authenticate

    ==== CL 16269 ====
    @FIX: properly support timeouts on socket connections

    @NEW: add "-timeout N" option to the qbping command; the qbping(), qbworkerping(), and qbhostping() API routines now honor the timeout set via "qbsettimeout()".

    JIRA: QUBE-1746

    ==== CL 16266 ====
    @NEW: a new command-line utility for performing both database health checks and data integrity checks

    ==== CL 16247 ====
    @FIX: fixed qb.workid() in callbacks to return the correct workid of the current callback context (it had been always returning None)

    Also changed qb.jobstatus(), workstatus(), and subjobstatus() so that, if
    invoked in a callback giving no args (like a jobid and workid or subjobid),
    they return the status of the respective thing (job, work, or subjob) of
    the current callback context.

    JIRA: QUBE-1763
    ZD: 16105

    ==== CL 16235 ====
    @FIX: a problem with the filtering added to avoid jobs with an ID of 0, in CL15821

    This was causing preemption to not function in many cases.

    ZD: 16006

    ==== CL 16229 ====
    @FIX: On Windows, daemons (supe, worker) now ignore VMWare Virtual Ethernet Adapters when trying to pick a primary mac address (QbConnection.cpp) for the host, which is used to uniquely identify hosts

    ZD: 14481

    ==== CL 16214 ====
    @FIX: aerender AppFinder mangling first path conversion on Windows when using UNC

    ==== CL 16177 ====
    @NEW: add metered_max and metered_used fields to the dict returned by qb.ping(asDict=True)

    JIRA: QUBE-1745

    ==== CL 16145 ====
    @NEW: add support for Metered Licensing

    ==== CL 16139 ====
    @FIX: changed the duplicate "stop_activity" entry (it was listed twice) in qb_supervisor_mode_flag_string() to "enforce_password"; the duplicate was causing string-to-int conversion of the mode flags to be incorrect

    ==== CL 16064 ====
    @FIX: when the job's 'dev' attribute is True, printing the job package with regex_errors causes the logParser to generate a false positive for the regex_errors match

    ==== CL 16049 ====
    @NEW: add 'outputPath match required' to python-based jobs, frame/work is failed if no match is found

    ==== CL 15974 ====
    @CHANGE: add support for "-conf PATH" to specify qb.conf for worker (phase 1)

    JIRA: QUBE-253

    ==== CL 15970 ====
    @FIX: modified (un)install_supervisor scripts to properly support CentOS/RHEL 7+ with mariadb and systemd.

    Also modified configure_mysql script (for Linux) to be able to detect the
    version of mysql installed on the system, even when the server is not
    running

    JIRA: QUBE-1663

    ==== CL 15964 ====
    @NEW: changes to code that generates/modifies my.cnf

    @CHANGE: some refactoring of the configure_mysql script (run on linux on
    (un)installation of the supervisor) to modify my.cnf.

    @NEW: make sure "default-storage-engine=MyISAM" is set on Linux too

    @NEW: add "query_cache_type=0" to my.cnf on all platforms

    JIRA: QUBE-1663

    ==== CL 15960 ====
    @FIX: jobs submitted with pgrp set to a (null) string end up having a pgrp of 0

    JIRA: QUBE-1668

    ==== CL 15957 ====
    @FIX: use of single-quotes in job dependency "info-*" syntax results in hung job instances

    JIRA: QUBE-1571

    ==== CL 15947 ====
    @CHANGE: adding "default-storage-engine=MYISAM" to the my.cnf generated for Linux/OSX supe installations

    JIRA: QUBE-1663

    ==== CL 15936 ====
    @CHANGE: add InnoDB to MyISAM conversion code in upgrade_supervisor program for all "qube" tables

    JIRA: QUBE-1664

    ==== CL 15909 ====
    @CHANGE: fix flaw in auto-wrangling logic in which it sometimes wouldn't detect a bad worker, allowing it to fail many job agendas.

    When a single job instance/worker has failed all of its assigned frames (at
    least aw_activation_work_count frames) for a job, while other workers are
    still processing their first frame (i.e., no other worker/instance has
    finished a frame), the system deems this worker "bad", locks it,
    migrates the failed frames and instance, and notifies the admin.
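
    The detection condition described above can be written as a single predicate. This is a sketch of the stated rule only, not the supervisor's implementation; the function and parameter names (other than aw_activation_work_count) are hypothetical.

```python
def is_bad_worker(assigned, failed, aw_activation_work_count, others_finished):
    """Return True when a worker matches the "bad worker" rule above.

    Hypothetical helper. Arguments:
      assigned        - frames of the job assigned to this worker/instance
      failed          - how many of those frames it has failed
      others_finished - frames finished by all other workers on the job
    """
    return (
        assigned >= aw_activation_work_count  # enough frames to judge
        and failed == assigned                # it failed every one of them
        and others_finished == 0              # nobody else has finished yet
    )
```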

    JIRA: QUBE-1475
    ZD: 15219

    ==== CL 15865 ====
    @CHANGE: Made section headers (such as "[default]" or "[node[001-199]]") case-insensitive in config files such as qbwrk.conf

    JIRA: QUBE-1356

    ==== CL 15821 ====
    @FIX: add code to the DB routines and doPreemption() routine to silently ignore job records with job ID of 0 (likely due to corrupt DB records), which was spewing out many warning messages into the supelog

    ZD: 15739

    ==== CL 15809 ====
    @FIX: backslashed characters in VRED jobs get treated as escape characters

    ==== CL 15700 ====
    @NEW: add "--conf filename" option to supervisor to specify an alternate location and name for the qb.conf file

    JIRA: QUBE-253

    ==== CL 15673 ====
    @FIX: orphaned job processes left behind on Windows workers, especially when the proxy.exe program dies unexpectedly

    ZD: 15518

    ==== CL 15653 ====
    @FIX: setting a job's "pgrp" value prior to submission is ignored for all but the first job when submitting a list of jobs via a single call to the qbsubmit() API routine

    JIRA: QUBE-1536
    ZD: 15528

    ==== CL 15650 ====
    @FIX: Explicitly setting "host.memory" in worker_resources broken on Linux

    ZD: 15505
    JIRA: QUBE-1531

    ==== CL 15642 ====
    @FIX: Unix (Linux/OSX) workers, when running a cleanup process for a terminating job instance (via removeJob()), would sometimes inadvertently kill processes belonging to other job instances, due to process IDs once owned by the terminating job being reused by the system.

    ZD: 15548

    ==== CL 15567 ====
    @FIX: supervisor_default_max_cpus value was not being applied properly

    ZD: 15503
    JIRA: QUBE-1528

    ==== CL 15560 ====
    @CHANGE: "modify" operation will print, into the supelog and the job's .hst file, the values of the newly modified parameters

    JIRA: QUBE-1318
    ZD: 14979

    ==== CL 15531 ====
    @NEW: add run_program_and_convert_encoding.pl script, which is a wrapper to run any given program and convert its stdout from and to specified encodings (like UTF-16le to UTF-8).

    Added to support 3dsmax batch (i.e., "cmdrange") submissions.
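
    The conversion step itself is simple to illustrate. The sketch below is written in Python rather than Perl and is not the script shipped with Qube; it only shows the same idea of decoding bytes from one encoding and re-encoding them in another. The function name and default encodings are chosen for this example.

```python
def convert_encoding(data, from_encoding="utf-16-le", to_encoding="utf-8"):
    """Re-encode a chunk of program output.

    Illustrative helper only: decodes the raw bytes using from_encoding
    and encodes the resulting text using to_encoding.
    """
    return data.decode(from_encoding).encode(to_encoding)
```

    A wrapper would apply this to the child program's stdout before passing it on, so that downstream log parsing sees a single consistent encoding.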

    JIRA: QUBE-1210

    ==== CL 15462 ====
    @FIX: removed submission-time check for jobtype existence on the farm, as it was causing false negatives in certain cases and disallowing submissions

    ZD: 15328, 15831

    ==== CL 15423 ====
    @FIX: KeyError: "regex_outputPaths" is raised when min file size check is specified, but no outputPath regular expression is defined

    ==== CL 15384 ====
    @NEW: add Mac OS X 10.11, aka "El Capitan" support

    ==== CL 15380 ====
    @CHANGE: modification now allowed on "done" jobs

    ZD: 15281

    ==== CL 15351 ====
    @FIX: Windows issue where wireless network interfaces are ignored when licenses are verified, causing license keys bound to such interfaces to not work.

    ==== CL 15347 ====
    @FIX: Windows issue where wireless network interfaces are ignored when licenses are verified, causing license keys bound to such interfaces to not work.

    ==== CL 15324 ====
    @CHANGE: supervisor on Win32 to build against Perl 5.8 (upgraded from 5.6) to avoid build issues on new build platform.
