Releases: Qihoo360/hbox

v1.9.5

01 Jul 08:22

Bug Fixes

  • Fix configuration file reading order

Full Changelog: v1.9.4...v1.9.5

v1.9.4

26 Jun 08:02

Bug Fixes

  • Fix error when HBOX_CLIENT_OPTS is undefined

Full Changelog: v1.9.3...v1.9.4

v1.9.3

26 Jun 03:29

Bug Fixes and Other Changes

  • Fix CLI parsing for --conf, --input and --output
  • Prevent RunJar from unjarring the client into a temp dir
  • Open HBOX_CLIENT_OPTS flags

Full Changelog: v1.9.2...v1.9.3

v1.9.2

23 May 03:37

Bug Fixes and Other Changes

  • Fix missing MPI logs.
  • Add role and rank in job history page.

Full Changelog: v1.9.1...v1.9.2

v1.9.1

15 May 03:33

Bug Fixes and Other Changes

  • Support switching the submitting user in Azkaban.

Full Changelog: v1.9.0...v1.9.1

v1.9.0

13 May 08:07

Major Features And Improvements

  • Support full mobile deployment of OpenMPI on AM.
  • Format Java code using palantir-java-format.
  • Support emitting OpenTelemetry traces.

Bug Fixes and Other Changes

  • Fix hbox-logs issue and improve MPI orted status monitoring.
  • Fix PATH and LD_LIBRARY_PATH for MPI jobs.
  • Fix Hadoop UGI and shellcheck errors.

Full Changelog: v1.8.1...v1.9.0

v1.8.0

01 Jul 03:04

What's Changed

  • Support GPU training
  • Support multiple versions of YARN (>= 2.6)
  • add faq by @yuyajian in #15
  • fix the board history display by @jiarunying in #17
  • reset worker num when less than inputfile number by @liyuance in #35
  • xlearning container allocate port from 20000 to 30000 by default by @FANNG1 in #45
  • add docker support by @SuperbDong in #64
  • Tensorflow example hangs when using multiple workers by @lshmouse in #62
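
The default container port range noted in the list above (20000 to 30000) can be illustrated with a small sketch. This is not hbox's or XLearning's actual allocation code; the function name `reserve_port`, the inclusive treatment of both bounds, and the loopback host are assumptions made only for illustration. It shows one common way to reserve a port from a fixed range: try to bind each candidate and keep the first socket that succeeds.

```python
import socket


def reserve_port(low=20000, high=30000, host="127.0.0.1"):
    """Bind the first free TCP port in [low, high] and return (port, socket).

    Hypothetical helper: the caller keeps the socket open for as long as
    the port should stay reserved, then closes it.
    """
    for port in range(low, high + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            # Port is taken (or not bindable); try the next candidate.
            sock.close()
            continue
        return port, sock
    raise RuntimeError(f"no free port in range {low}-{high}")


if __name__ == "__main__":
    port, sock = reserve_port()
    print("reserved port", port)
    sock.close()
```

Holding the bound socket open is what keeps a second caller on the same host from receiving the same port while the first reservation is still in use.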

Full Changelog: v1.7.2...v1.8.0

Full Changelog: v1.4...v1.8.0

v1.8.0-beta2

24 Jun 10:46
Pre-release

What's Changed

  • add faq by @yuyajian in #15
  • fix the board history display by @jiarunying in #17
  • reset worker num when less than inputfile number by @liyuance in #35
  • xlearning container allocate port from 20000 to 30000 by default by @FANNG1 in #45
  • add docker support by @SuperbDong in #64
  • Tensorflow example hangs when using multiple workers by @lshmouse in #62

Full Changelog: https://github.com/Qihoo360/hbox/commits/v1.8.0-beta2

XLearning 1.4

16 May 08:29

Release XLearning 1.4

Major Features And Improvements

  • Support running applications in Docker
  • Support MPI applications
  • ClusterDef is available for the TensorFlow Distribution Strategy API
  • Allow memory to be set separately for the chief and estimator workers of a TensorFlow application
  • Allow specifying the YARN node label for job execution
  • Upload the output with multiple threads
  • Allow incremental upload of intermediate results
  • Support regular-expression matching for input paths

Bug Fixes and Other Changes

  • The memory usage adjustment prompt is displayed only when the application finishes with a succeeded status.

XLearning 1.3

16 Oct 11:03

Release XLearning 1.3

Major Features And Improvements

  • Support lightLDA; see examples/lightLDA for usage
  • Support xflow; see examples/xflow for usage
  • Support user-defined environment variables, set through configuration parameters at submission
  • Set the last worker as the estimator role of the distributed TensorFlow application if the user sets tf-evaluator to true; see examples/tfEstimators for usage
  • Define the index of the single worker that saves the output by setting output-index
  • Optimize the port reservation mechanism
  • Add a priority mechanism for allocating containers on nodes that hold the data locally
  • Display resource request and usage information
  • Extend the ps role: more convenient rendering of metrics usage information, and output upload

Bug Fixes and Other Changes

  • In distributed mode, a container could hang while waiting for the remaining machines' port addresses after another container had failed
  • Release redundant container requests once the workers have been allocated, and add a remove-request operation
  • The application failed in PLACEHOLDER mode when the input made the environment variables too long
  • Control the conditions used to judge that a job has failed
  • An incorrect status code was returned when a container exited successfully