JasonWang

JasonWang synced commits to main at JasonWang/fastText from mirror

  • 1142dc4c4e Delete .circleci directory (#1366) Summary: Pull Request resolved: https://github.com/facebookresearch/fastText/pull/1366 Reviewed By: jailby Differential Revision: D54850920 Pulled By: bigfootjon fbshipit-source-id: 9a3eec7b7cb42335a786fb247cb16be9ed3c2d59

2 years ago

JasonWang synced commits to bigfootjon-patch-1 at JasonWang/fastText from mirror

  • 61d8afd17a Delete .circleci directory
  • ae1fe80e9f deeplearning/fastText 2/2 Reviewed By: azad-meta Differential Revision: D53908330 fbshipit-source-id: b2215f0522c32a82cd876633210befefe9317d76
  • b733943e84 Predict 1.9-4.2x faster (#1341)
    Summary: I made prediction 1.9x to 4.2x faster than before.
    # Motivation
    I want to use https://tinyurl.com/nllblid218e and similarly parametrized models to run language classification on petabytes of web data.
    # Methodology
    The costliest operation is summing the rows for each model input. I've optimized this in three ways:
    1. `addRowToVector` was a virtual function call for each row. I've replaced this with one virtual function call per prediction by adding `averageRowsToVector` to `Matrix` calls.
    2. `Vector` and `DenseMatrix` were not 64-byte aligned so the CPU was doing a lot of unaligned memory access. I've brought in my own `vector` replacement that does 64-byte alignment.
    3. Write the `averageRowsToVector` in intrinsics for common vector sizes. This works on SSE, AVX, and AVX512F.
    See the commit history for a breakdown of speed improvement from each change.
    # Experiments
    Test set [docs1000.txt.gz](https://github.com/facebookresearch/fastText/files/11832996/docs1000.txt.gz) which is a bunch of random documents https://data.statmt.org/heafield/classified-fasttext/
    CPU: AMD Ryzen 9 7950X 16-Core
    Model https://tinyurl.com/nllblid218e with 256-dimensional vectors
    Before: real 0m8.757s, user 0m8.434s, sys 0m0.327s
    After: real 0m2.046s, user 0m1.717s, sys 0m0.334s
    Model https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin with 16-dimensional vectors
    Before: real 0m0.926s, user 0m0.889s, sys 0m0.037s
    After: real 0m0.477s, user 0m0.436s, sys 0m0.040s
    Pull Request resolved: https://github.com/facebookresearch/fastText/pull/1341
    Reviewed By: graemenail
    Differential Revision: D52134736
    Pulled By: kpuatfb
    fbshipit-source-id: 42067161f4c968c34612934b48a562399a267f3b
  • 6c2204ba66 fix compile error with gcc13 #1281 (#1340) Summary: Due to [header dependency changes](https://gcc.gnu.org/gcc-13/porting_to.html#header-dep-changes) in GCC 13, we need to include the <cstdint> header. Pull Request resolved: https://github.com/facebookresearch/fastText/pull/1340 Reviewed By: jmp84 Differential Revision: D51602433 Pulled By: alexkosau fbshipit-source-id: cc9bffb276cb00f1db8ec97a36784c484ae4563a
  • de458ddea4 Add pyproject.toml to comply with PEP-518 (#1292) Summary: Adds pyproject.toml to comply with PEP-518, which fixes the building of the library by poetry - See https://github.com/python-poetry/poetry/issues/6113 . This is a copy of https://github.com/facebookresearch/fastText/pull/1270 , but I have signed the CLA. Pull Request resolved: https://github.com/facebookresearch/fastText/pull/1292 Differential Revision: D51601444 Pulled By: alexkosau fbshipit-source-id: 357d702281ca3519c3640483eba04d124d0744b4
  • View comparison for 10 commits »

2 years ago

JasonWang synced new reference bigfootjon-patch-1 to JasonWang/fastText from mirror

2 years ago

JasonWang synced commits to main at JasonWang/fastText from mirror

  • ae1fe80e9f deeplearning/fastText 2/2 Reviewed By: azad-meta Differential Revision: D53908330 fbshipit-source-id: b2215f0522c32a82cd876633210befefe9317d76

2 years ago

JasonWang synced commits to main at JasonWang/fastText from mirror

  • b733943e84 Predict 1.9-4.2x faster (#1341)
    Summary: I made prediction 1.9x to 4.2x faster than before.
    # Motivation
    I want to use https://tinyurl.com/nllblid218e and similarly parametrized models to run language classification on petabytes of web data.
    # Methodology
    The costliest operation is summing the rows for each model input. I've optimized this in three ways:
    1. `addRowToVector` was a virtual function call for each row. I've replaced this with one virtual function call per prediction by adding `averageRowsToVector` to `Matrix` calls.
    2. `Vector` and `DenseMatrix` were not 64-byte aligned so the CPU was doing a lot of unaligned memory access. I've brought in my own `vector` replacement that does 64-byte alignment.
    3. Write the `averageRowsToVector` in intrinsics for common vector sizes. This works on SSE, AVX, and AVX512F.
    See the commit history for a breakdown of speed improvement from each change.
    # Experiments
    Test set [docs1000.txt.gz](https://github.com/facebookresearch/fastText/files/11832996/docs1000.txt.gz) which is a bunch of random documents https://data.statmt.org/heafield/classified-fasttext/
    CPU: AMD Ryzen 9 7950X 16-Core
    Model https://tinyurl.com/nllblid218e with 256-dimensional vectors
    Before: real 0m8.757s, user 0m8.434s, sys 0m0.327s
    After: real 0m2.046s, user 0m1.717s, sys 0m0.334s
    Model https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin with 16-dimensional vectors
    Before: real 0m0.926s, user 0m0.889s, sys 0m0.037s
    After: real 0m0.477s, user 0m0.436s, sys 0m0.040s
    Pull Request resolved: https://github.com/facebookresearch/fastText/pull/1341
    Reviewed By: graemenail
    Differential Revision: D52134736
    Pulled By: kpuatfb
    fbshipit-source-id: 42067161f4c968c34612934b48a562399a267f3b

2 years ago
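
The first two optimizations described in commit b733943e84 (one call that averages all rows, and 64-byte aligned storage) can be illustrated with a minimal C++17 sketch. It assumes a row-major float matrix. The names `AlignedFloats`, `makeAligned`, and `averageRows` are hypothetical and chosen for illustration only; this is not fastText's actual `Vector`, `DenseMatrix`, or `averageRowsToVector` code, and the loop is plain scalar C++ rather than the SSE/AVX intrinsics the commit mentions.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <vector>

// Assumed alignment, matching the 64 bytes named in the commit message.
constexpr std::size_t kAlignment = 64;

// Hypothetical deleter for storage obtained from the aligned operator new[].
struct AlignedDeleter {
  void operator()(float* p) const {
    ::operator delete[](p, std::align_val_t(kAlignment));
  }
};
using AlignedFloats = std::unique_ptr<float[], AlignedDeleter>;

// Allocates n zero-initialized floats on a 64-byte boundary.
AlignedFloats makeAligned(std::size_t n) {
  void* raw = ::operator new[](n * sizeof(float), std::align_val_t(kAlignment));
  float* p = static_cast<float*>(raw);
  for (std::size_t i = 0; i < n; ++i) {
    p[i] = 0.0f;
  }
  return AlignedFloats(p);
}

// Averages the selected rows of a row-major matrix into `out` in one call,
// instead of making one (virtual) call per row.
// `matrix` holds rows of `dim` floats each; `out` must hold `dim` floats.
void averageRows(const float* matrix, std::size_t dim,
                 const std::vector<std::int32_t>& rows, float* out) {
  for (std::size_t d = 0; d < dim; ++d) {
    out[d] = 0.0f;
  }
  if (rows.empty()) {
    return;  // Nothing to average; leave the output at zero.
  }
  for (std::int32_t r : rows) {
    const float* row = matrix + static_cast<std::size_t>(r) * dim;
    for (std::size_t d = 0; d < dim; ++d) {
      out[d] += row[d];
    }
  }
  const float scale = 1.0f / static_cast<float>(rows.size());
  for (std::size_t d = 0; d < dim; ++d) {
    out[d] *= scale;
  }
}
```

A caller would allocate the matrix and the output with `makeAligned`, fill the matrix, and pass the row indices for one input to `averageRows`. Keeping the whole reduction in a single function is what gives a compiler, or hand-written intrinsics, a contiguous aligned loop to vectorize.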

JasonWang synced commits to main at JasonWang/fastText from mirror

  • 6c2204ba66 fix compile error with gcc13 #1281 (#1340) Summary: Due to [header dependency changes](https://gcc.gnu.org/gcc-13/porting_to.html#header-dep-changes) in GCC 13, we need to include the <cstdint> header. Pull Request resolved: https://github.com/facebookresearch/fastText/pull/1340 Reviewed By: jmp84 Differential Revision: D51602433 Pulled By: alexkosau fbshipit-source-id: cc9bffb276cb00f1db8ec97a36784c484ae4563a
  • de458ddea4 Add pyproject.toml to comply with PEP-518 (#1292) Summary: Adds pyproject.toml to comply with PEP-518, which fixes the building of the library by poetry - See https://github.com/python-poetry/poetry/issues/6113 . This is a copy of https://github.com/facebookresearch/fastText/pull/1270 , but I have signed the CLA. Pull Request resolved: https://github.com/facebookresearch/fastText/pull/1292 Differential Revision: D51601444 Pulled By: alexkosau fbshipit-source-id: 357d702281ca3519c3640483eba04d124d0744b4
  • View comparison for 2 commits »

2 years ago

JasonWang synced commits to main at JasonWang/fastText from mirror

  • 789e32860d deeplearning, dcp (2972240286315620591) Reviewed By: r-barnes Differential Revision: D49677606 fbshipit-source-id: ec5b375177586c76ecccb83a29b562bc6e9961f6

2 years ago

JasonWang synced commits to main at JasonWang/fastText from mirror

  • 166ce2c71a Del `(object)` from 200 inc deeplearning/aicamera/trainer/utils/metrics.py Summary: Python3 makes the use of `(object)` in class inheritance unnecessary. Let's modernize our code by eliminating this. Reviewed By: itamaro Differential Revision: D48673901 fbshipit-source-id: 3e0ef05efe886b32a07bb58bd0725fa2ec934c14

2 years ago

JasonWang synced commits to main at JasonWang/fastText from mirror

  • 440f46ac88 Migrate "deeplearning/fastText" from LLVM-12 to LLVM-15 Summary: fbcode is migrating to LLVM-15 for safer and more up-to-date code and new compiler features. All contbuilds in your directory have passed our build test with LLVM-15, and your directory does not host any packages. This diff will migrate it to LLVM-15. If you approve of this diff, please use the "Accept & Ship" button. If you have a reason for why it should not build with LLVM 15, please make a comment and send it back to author. Otherwise we will land this on Thursday 06/15/2023. See the [FAQ post](https://fb.workplace.com/groups/llvm15platform010/posts/749154386769776/)! Please also direct any questions to [this group](https://fb.workplace.com/groups/llvm15platform010). - If you approve of this diff, please use the "Accept & Ship" button :-) Reviewed By: meyering Differential Revision: D46661531 fbshipit-source-id: 7278fbfcadec2392c94efd6deb710bdd5e9280f8

2 years ago

JasonWang synced commits to main at JasonWang/fastText from mirror

  • 48171ac647 Add documentation about Hugging Face integration (#1335) Summary: [Word vectors](https://huggingface.co/facebook/fasttext-en-vectors) for 157 languages are now hosted on the Hugging Face Hub as well as the [language identification model](https://huggingface.co/facebook/fasttext-language-identification). (cc ajoulin) A newer language model [referred in the NLLB project](https://github.com/facebookresearch/fairseq/blob/nllb/README.md#lid-model) is not mentioned in the official website, so I updated the doc accordingly. Pull Request resolved: https://github.com/facebookresearch/fastText/pull/1335 Reviewed By: Celebio Differential Revision: D46507563 Pulled By: jmp84 fbshipit-source-id: 64883a6829c68b968acd980ba77a712b8e7a1365

2 years ago

JasonWang synced commits to main at JasonWang/fastText from mirror

  • 0622aad857 Replace outdated url in the scripts Summary: Replace outdated url in the scripts Reviewed By: piotr-bojanowski Differential Revision: D43464784 fbshipit-source-id: 51a98a9ad5a0939acd0d578126290909a613938b

2 years ago

JasonWang synced commits to dependabot/nuget/tests/UnitTest/Newtonsoft.Json-13.0.2 at JasonWang/MineCase from mirror

  • 5a6fc00d3d Bump Newtonsoft.Json from 12.0.3 to 13.0.2 in /tests/UnitTest Bumps [Newtonsoft.Json](https://github.com/JamesNK/Newtonsoft.Json) from 12.0.3 to 13.0.2. - [Release notes](https://github.com/JamesNK/Newtonsoft.Json/releases) - [Commits](https://github.com/JamesNK/Newtonsoft.Json/compare/12.0.3...13.0.2) --- updated-dependencies: - dependency-name: Newtonsoft.Json dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]>
  • 87863ae1ea Update packages and apply source genenerator (#141) * Update to net5.0 * Use generic host in server * Add code gen * Add serialize codegen * Complete protocol refactor * bug fix * Move config loader * Update package sources * Update set compression * Add azure pipelines * Set up CI with Azure Pipelines [skip ci] * Update azure-pipelines.yml for Azure Pipelines * Replace multiple frameworks Co-authored-by: JasonWang <[email protected]>
  • 078275e378 Added financial contributors to the README
  • 8512935493 position read write bug
  • 519a4cd785 whatever
  • View comparison for 10 commits »

3 years ago

JasonWang synced new reference dependabot/nuget/tests/UnitTest/Newtonsoft.Json-13.0.2 to JasonWang/MineCase from mirror

3 years ago

JasonWang synced commits to dependabot/nuget/src/MineCase.Server/Newtonsoft.Json-13.0.2 at JasonWang/MineCase from mirror

  • 6ce817257c Bump Newtonsoft.Json from 12.0.3 to 13.0.2 in /src/MineCase.Server Bumps [Newtonsoft.Json](https://github.com/JamesNK/Newtonsoft.Json) from 12.0.3 to 13.0.2. - [Release notes](https://github.com/JamesNK/Newtonsoft.Json/releases) - [Commits](https://github.com/JamesNK/Newtonsoft.Json/compare/12.0.3...13.0.2) --- updated-dependencies: - dependency-name: Newtonsoft.Json dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]>
  • 87863ae1ea Update packages and apply source genenerator (#141) * Update to net5.0 * Use generic host in server * Add code gen * Add serialize codegen * Complete protocol refactor * bug fix * Move config loader * Update package sources * Update set compression * Add azure pipelines * Set up CI with Azure Pipelines [skip ci] * Update azure-pipelines.yml for Azure Pipelines * Replace multiple frameworks Co-authored-by: JasonWang <[email protected]>
  • 078275e378 Added financial contributors to the README
  • 8512935493 position read write bug
  • 519a4cd785 whatever
  • View comparison for 10 commits »

3 years ago

JasonWang synced commits to dependabot/nuget/src/MineCase.Server.Grains/Newtonsoft.Json-13.0.2 at JasonWang/MineCase from mirror

  • 05cb947c89 Bump Newtonsoft.Json in /src/MineCase.Server.Grains Bumps [Newtonsoft.Json](https://github.com/JamesNK/Newtonsoft.Json) from 12.0.3 to 13.0.2. - [Release notes](https://github.com/JamesNK/Newtonsoft.Json/releases) - [Commits](https://github.com/JamesNK/Newtonsoft.Json/compare/12.0.3...13.0.2) --- updated-dependencies: - dependency-name: Newtonsoft.Json dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]>
  • 87863ae1ea Update packages and apply source genenerator (#141) * Update to net5.0 * Use generic host in server * Add code gen * Add serialize codegen * Complete protocol refactor * bug fix * Move config loader * Update package sources * Update set compression * Add azure pipelines * Set up CI with Azure Pipelines [skip ci] * Update azure-pipelines.yml for Azure Pipelines * Replace multiple frameworks Co-authored-by: JasonWang <[email protected]>
  • 078275e378 Added financial contributors to the README
  • 8512935493 position read write bug
  • 519a4cd785 whatever
  • View comparison for 10 commits »

3 years ago

JasonWang synced commits to dependabot/nuget/src/MineCase.Protocol/Newtonsoft.Json-13.0.2 at JasonWang/MineCase from mirror

  • 2d1c84a107 Bump Newtonsoft.Json from 12.0.3 to 13.0.2 in /src/MineCase.Protocol Bumps [Newtonsoft.Json](https://github.com/JamesNK/Newtonsoft.Json) from 12.0.3 to 13.0.2. - [Release notes](https://github.com/JamesNK/Newtonsoft.Json/releases) - [Commits](https://github.com/JamesNK/Newtonsoft.Json/compare/12.0.3...13.0.2) --- updated-dependencies: - dependency-name: Newtonsoft.Json dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]>
  • 87863ae1ea Update packages and apply source genenerator (#141) * Update to net5.0 * Use generic host in server * Add code gen * Add serialize codegen * Complete protocol refactor * bug fix * Move config loader * Update package sources * Update set compression * Add azure pipelines * Set up CI with Azure Pipelines [skip ci] * Update azure-pipelines.yml for Azure Pipelines * Replace multiple frameworks Co-authored-by: JasonWang <[email protected]>
  • 078275e378 Added financial contributors to the README
  • 8512935493 position read write bug
  • 519a4cd785 whatever
  • View comparison for 10 commits »

3 years ago

JasonWang synced commits to dependabot/nuget/src/MineCase.Core/Newtonsoft.Json-13.0.2 at JasonWang/MineCase from mirror

  • 0715debea3 Bump Newtonsoft.Json from 12.0.3 to 13.0.2 in /src/MineCase.Core Bumps [Newtonsoft.Json](https://github.com/JamesNK/Newtonsoft.Json) from 12.0.3 to 13.0.2. - [Release notes](https://github.com/JamesNK/Newtonsoft.Json/releases) - [Commits](https://github.com/JamesNK/Newtonsoft.Json/compare/12.0.3...13.0.2) --- updated-dependencies: - dependency-name: Newtonsoft.Json dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]>
  • 87863ae1ea Update packages and apply source genenerator (#141) * Update to net5.0 * Use generic host in server * Add code gen * Add serialize codegen * Complete protocol refactor * bug fix * Move config loader * Update package sources * Update set compression * Add azure pipelines * Set up CI with Azure Pipelines [skip ci] * Update azure-pipelines.yml for Azure Pipelines * Replace multiple frameworks Co-authored-by: JasonWang <[email protected]>
  • 078275e378 Added financial contributors to the README
  • 8512935493 position read write bug
  • 519a4cd785 whatever
  • View comparison for 10 commits »

3 years ago