Deep Learning Chinese Word Segment

Overview

引用 

  本项目模型BiLSTM+CRF参考论文:http://www.aclweb.org/anthology/N16-1030 ,IDCNN+CRF参考论文:https://arxiv.org/abs/1702.02098

构建

  1. 安装好bazel代码构建工具,安装好tensorflow(目前本项目需要tf 1.0.0alpha版本以上)

  2. 切换到本项目代码目录,运行./configure

  3. 编译后台服务

    bazel build //kcws/cc:seg_backend_api

训练

  1. 关注待字闺中公众号 回复 kcws 获取语料下载地址:

    logo

  2. 解压语料到一个目录

  3. 切换到代码目录,运行:

python kcws/train/process_anno_file.py <语料目录> pre_chars_for_w2v.txt

bazel build third_party/word2vec:word2vec

先得到初步词表

./bazel-bin/third_party/word2vec/word2vec -train pre_chars_for_w2v.txt -save-vocab pre_vocab.txt -min-count 3

处理低频词   python kcws/train/replace_unk.py pre_vocab.txt pre_chars_for_w2v.txt chars_for_w2v.txt

训练word2vec

./bazel-bin/third_party/word2vec/word2vec -train chars_for_w2v.txt -output vec.txt -size 50 -sample 1e-4 -negative 5 -hs 1 -binary 0 -iter 5

构建训练语料工具

bazel build kcws/train:generate_training

生成语料

./bazel-bin/kcws/train/generate_training vec.txt <语料目录> all.txt

得到train.txt , test.txt文件

python kcws/train/filter_sentence.py all.txt

  1. 安装好tensorflow,切换到kcws代码目录,运行:

python kcws/train/train_cws.py --word2vec_path vec.txt --train_data_path <绝对路径到train.txt> --test_data_path test.txt --max_sentence_len 80 --learning_rate 0.001  (默认使用IDCNN模型,可设置参数”--use_idcnn False“来切换BiLSTM模型)

  1. 生成vocab

bazel build kcws/cc:dump_vocab

./bazel-bin/kcws/cc/dump_vocab vec.txt kcws/models/basic_vocab.txt

  1. 导出训练好的模型

python tools/freeze_graph.py --input_graph logs/graph.pbtxt --input_checkpoint logs/model.ckpt --output_node_names "transitions,Reshape_7" --output_graph kcws/models/seg_model.pbtxt

  1. 词性标注模型下载 (临时方案,后续文档给出词性标注模型训练,导出等)

    https://pan.baidu.com/s/1bYmABk 下载pos_model.pbtxt到kcws/models/目录下

  2. 运行web service

./bazel-bin/kcws/cc/seg_backend_api --model_path=kcws/models/seg_model.pbtxt(绝对路径到seg_model.pbtxt>) --vocab_path=kcws/models/basic_vocab.txt --max_sentence_len=80

词性标注的训练说明:

https://github.com/koth/kcws/blob/master/pos_train.md

自定义词典

目前支持自定义词典是在解码阶段,参考具体使用方式请参考kcws/cc/test_seg.cc 字典为文本格式,每一行格式如下:

<自定义词条>\t<权重>

比如:

蓝瘦香菇 4

权重为一个正整数,一般4以上,越大越重要

demo

http://45.32.100.248:9090/

附: 使用相同模型训练的公司名识别demo:

http://45.32.100.248:18080

Comments
  • 大神,bazel build //kcws/cc:seg_backend_api 报错

    大神,bazel build //kcws/cc:seg_backend_api 报错

    ERROR: /root/kcws/third_party/gflags/BUILD:12:1: Executing genrule //third_party/gflags:gflags-srcs failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 77.

    opened by maczhao 15
  • ERROR: Analysis of target '//kcws/cc:seg_backend_api' failed; build aborted.

    ERROR: Analysis of target '//kcws/cc:seg_backend_api' failed; build aborted.

    Hi, when I build the kcws, there are some issues, how can I fix them?

    the issues are as follow:

    [[email protected] cc]# /opt/BioDir/dl/bazel-0.4.3/output/bazel build //kcws/cc:seg_backend_api WARNING: Sandboxed execution is not supported on your system and thus hermeticity of actions cannot be guaranteed. See http://bazel.build/docs/bazel-user-manual.html#sandboxing for more information. You can turn off this warning via --ignore_unsupported_sandboxing. WARNING: /root/.cache/bazel/_bazel_root/067d099fd5fd2abf4236febace697e72/external/org_tensorflow/tensorflow/workspace.bzl:13:5: path_prefix was specified to tf_workspace but is no longer used and will be removed in the future. WARNING: /root/.cache/bazel/_bazel_root/067d099fd5fd2abf4236febace697e72/external/org_tensorflow/tensorflow/workspace.bzl:15:5: tf_repo_name was specified to tf_workspace but is no longer used and will be removed in the future. ERROR: /root/.cache/bazel/_bazel_root/067d099fd5fd2abf4236febace697e72/external/org_tensorflow/tensorflow/core/platform/default/build_config/BUILD:108:1: error loading package '@jpeg//': Extension file not found. Unable to load package for '//third_party:common.bzl': BUILD file not found on package path and referenced by '@org_tensorflow//tensorflow/core/platform/default/build_config:jpeg'. ERROR: Analysis of target '//kcws/cc:seg_backend_api' failed; build aborted. INFO: Elapsed time: 2.612s

    ================= I build the bazel tools as follow:

    [[email protected] bazel-0.4.3]# bash ./compile.sh INFO: You can skip this first step by providing a path to the bazel binary as second argument: INFO: ./compile.sh compile /path/to/bazel  Building Bazel from scratch.......  Building Bazel with Bazel. .WARNING: /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE:1: Workspace name in /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE (@io_bazel) does not match the name given in the repository's definition (@bazel_tools); this will cause a build error in future versions. INFO: Found 1 target... INFO: From Compiling third_party/ijar/platform_utils.cc [for host]: third_party/ijar/platform_utils.cc: In function 'bool devtools_ijar::write_file(const char*, mode_t, const void*, size_t)': third_party/ijar/platform_utils.cc:67:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (write(fd, data, size) != size) { ^ INFO: From Compiling third_party/ijar/platform_utils.cc: third_party/ijar/platform_utils.cc: In function 'bool devtools_ijar::write_file(const char*, mode_t, const void*, size_t)': third_party/ijar/platform_utils.cc:67:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (write(fd, data, size) != size) { ^ INFO: From Compiling third_party/ijar/ijar.cc: third_party/ijar/ijar.cc: In member function 'virtual bool devtools_ijar::JarStripperProcessor::Accept(const char*, devtools_ijar::u4)': third_party/ijar/ijar.cc:66:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (filename_len >= CLASS_EXTENSION_LENGTH) { ^ INFO: From Compiling third_party/ijar/ijar.cc [for host]: third_party/ijar/ijar.cc: In member function 'virtual bool devtools_ijar::JarStripperProcessor::Accept(const char*, devtools_ijar::u4)': third_party/ijar/ijar.cc:66:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (filename_len >= CLASS_EXTENSION_LENGTH) { ^ INFO: From Compiling src/main/cpp/blaze_util_posix.cc: src/main/cpp/blaze_util_posix.cc: In function 'void blaze::Daemonize(const string&)': src/main/cpp/blaze_util_posix.cc:190:28: warning: ignoring return value of 'int dup(int)', declared with attribute warn_unused_result [-Wunused-result] (void) dup(STDOUT_FILENO); // stderr (2>&1) ^ src/main/cpp/blaze_util_posix.cc: In function 'uint64_t blaze::AcquireLock(const string&, bool, bool, blaze::BlazeLock*)': src/main/cpp/blaze_util_posix.cc:578:30: warning: ignoring return value of 'int ftruncate(int, __off_t)', declared with attribute warn_unused_result [-Wunused-result] (void) ftruncate(lockfd, 0); ^ src/main/cpp/blaze_util_posix.cc:583:47: warning: ignoring return value of 'ssize_t write(int, const void*, size_t)', declared with attribute warn_unused_result [-Wunused-result] (void) write(lockfd, msg.data(), msg.size()); ^ INFO: From JavacBootstrap src/java_tools/buildjar/java/com/google/devtools/build/buildjar/libbootstrap_JarOwner.jar [for host]: warning: Implicitly compiled files were not subject to annotation processing. Use -proc:none to disable annotation processing or -implicit to specify a policy for implicit compilation. 1 warning INFO: From Building src/main/protobuf/libextra_actions_base_java_proto.jar (1 source jar): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/java_tools/junitrunner/java/com/google/testing/coverage/JacocoCoverage.jar (9 source files): Note: src/java_tools/junitrunner/java/com/google/testing/coverage/MethodProbesMapper.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/tools/android/java/com/google/devtools/build/android/ziputils/libziputils_lib.jar (12 source files): Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libconcurrent.jar (18 source files): Note: src/main/java/com/google/devtools/build/lib/concurrent/AbstractQueueVisitor.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building third_party/java/apkbuilder/apkbuilder.jar (15 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libutil.jar (45 source files): Note: src/main/java/com/google/devtools/build/lib/util/OrderedSetMultimap.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/cmdline/libcmdline.jar (10 source files): Note: src/main/java/com/google/devtools/build/lib/cmdline/RepositoryName.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/skyframe/libskyframe.jar (67 source files): Note: src/main/java/com/google/devtools/build/skyframe/ReverseDepsUtilImpl.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libsyntax.jar (86 source files): Note: src/main/java/com/google/devtools/build/lib/syntax/BuiltinFunction.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libpackages-internal.jar (98 source files): Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/actions/libactions.jar (91 source files): Note: src/main/java/com/google/devtools/build/lib/actions/Actions.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libbuild-base.jar (381 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libproto-rules.jar (13 source files): Note: src/main/java/com/google/devtools/build/lib/rules/proto/ProtoCommon.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/query2/libquery2.jar (12 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/query2/libquery-output.jar (10 source files): Note: src/main/java/com/google/devtools/build/lib/query2/output/QueryOutputUtils.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/rules/genquery/libgenquery.jar (2 source files): Note: src/main/java/com/google/devtools/build/lib/rules/genquery/GenQuery.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/rules/cpp/libcpp.jar (80 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libpython-rules.jar (15 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libjava-compilation.jar (37 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: src/main/java/com/google/devtools/build/lib/rules/java/JavaCompileAction.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libjava-rules.jar (32 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libandroid-rules.jar (59 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libideinfo.jar (4 source files): Note: src/main/java/com/google/devtools/build/lib/ideinfo/AndroidStudioInfoAspect.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/rules/objc/libobjc.jar (114 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: src/main/java/com/google/devtools/build/lib/rules/objc/IterableWrapper.java uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libruntime.jar (94 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/sandbox/libsandbox.jar (16 source files): Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/worker/libworker.jar (11 source files): Note: src/main/java/com/google/devtools/build/lib/worker/WorkerSpawnStrategy.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. INFO: From Building src/main/java/com/google/devtools/build/lib/libbazel-rules.jar (87 source files, 14 resources): Note: src/main/java/com/google/devtools/build/lib/bazel/rules/java/BazelJavaSemantics.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. Target //src:bazel up-to-date: bazel-bin/src/bazel INFO: Elapsed time: 178.725s, Critical Path: 170.17s WARNING: /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE:1: Workspace name in /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE (@io_bazel) does not match the name given in the repository's definition (@bazel_tools); this will cause a build error in future versions.

    Build successful! Binary is here: /opt/BioDir/dl/bazel-0.4.3/output/bazel

    opened by Sun-shan 12
  • error when run bazel build //kcws/cc:seg_backend_api

    error when run bazel build //kcws/cc:seg_backend_api

    ERROR: com.google.devtools.build.lib.packages.BuildFileContainsErrorsException: error loading package '': Encountered error while reading extension file 'tensorflow/workspace.bzl': no such package '@org_tensorflow//tensorflow': local_repository rule //external:org_tensorflow must specify an existing directory. INFO: Elapsed time: 0.049s

    build on: centos6.8 x64 no gpu support Build label: 0.4.1- (@non-git) tensorflow-0.11.0

    opened by busyfree 11
  • 关于标注部分的问题

    关于标注部分的问题

    大神好,我昨天仔细研究了您新添加的词性标注模块,然后我发现有几步好像有点问题,我自己尝试更改了一下,现在已经跑通了,99.57%的准确率,请您看看,问题如下: 1、在第五步骤,传入参数“lines_withpos.txt”,然而在代码里面并没有写入信息,我觉得应该得在代码里面添加 写入每个标注与其对应的序号。 2、在第六步骤,传入的第三个参数应该是上一步生成的词典“lines_withpos.txt”而不是”pos_vocab.txt“。

    您看这样是正确的吗?

    opened by oneapmlj 7
  • gflags link failed

    gflags link failed

    Linking using thirdparty gflags failed.

    Fixed by using self compiled gflags, maybe version issues of gflag. Modification made to Build files.

    
    --- a/third_party/glog/BUILD
    +++ b/third_party/glog/BUILD
    @@ -45,10 +45,7 @@ cc_library(
             "include/glog/stl_logging.h",
             "include/glog/vlog_is_on.h",
         ],
    -    deps = [
    -      "//third_party/gflags:gflags-cxx",
    -
    -    ],
    +    linkopts = ["-lgflags"],
         hdrs = [
             "include/glog/logging.h",
         ],
    
    opened by Vimos 7
  • 修改了max_word_num 的最大值,运行起来报错

    修改了max_word_num 的最大值,运行起来报错

    koth大大,请教个问题,我修改了 seg_backend_api.cc的 DEFINE_int32(max_word_num, 300, "max num of word per sentence ");将值改到了300,我测试的句子里面的字数比较多,在运行时报以下错误: E0918 11:23:35.434610 26934 tfmodel.cc:88] Error during inference: Invalid argument: Input to reshape is a tensor with 640 values, but the requested shape requires a multiple of 1200 [[Node: Reshape_7 = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _output_shapes=[[?,300,4]], _device="/job:localhost/replica:0/task:0/cpu: 0"](idcnn_1/scores, Reshape_7/shape)]] 2017-09-18 11:23:35.434675: E kcws/cc/tf_seg_model.cc:321] Error during inference:

    这种情况是不是我要重新训练models里面的word_vocab.txt文件?还是什么问题呢?如果是word_vocab.txt的问题,这个文本文件怎么训练呢?谢谢解惑.

    opened by younger911 6
  • F tensorflow/core/platform/cpu_feature_guard.cc:35] The TensorFlow library was compiled to use AVX2 instructions, but these aren't available on your machine.

    F tensorflow/core/platform/cpu_feature_guard.cc:35] The TensorFlow library was compiled to use AVX2 instructions, but these aren't available on your machine.

    [email protected]:/mnt/kcws# export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl [email protected]:/mnt/kcws# pip install --upgrade $TF_BINARY_URL

    通过这种安装的tensorflow,可以运行的。 但是这个项目启动会抛这个错误

    opened by weisong82 6
  • 关于默认分词的效果

    关于默认分词的效果

    我按照说明操作后,分词的效果如下。分词效果不是很准,下面是分词结果,这个正常吗? { "msg": "OK", "segments": [ "赵雅", "淇", "洒泪", "道", "歉", " ", "和林", "丹", "没", "有", "任", "何", "经济", "关", "系" ], "status": 0 }

    duplicate 
    opened by dengzz 5
  • embedding_size  AssertionError

    embedding_size AssertionError

    在最后train的时候:也就是运行: python kcws/train/train_cws_lstm.py --word2vec_path vec.txt --train_data_path <绝对路径到train.txt> --test_data_path test.txt --max_sentence_len 80 --learning_rate 0.001

    报错: Traceback (most recent call last): File "kcws/train/train_cws_lstm.py", line 262, in tf.app.run() File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run sys.exit(main(sys.argv[:1] + flags_passthrough)) File "kcws/train/train_cws_lstm.py", line 228, in main FLAGS.word2vec_path, FLAGS.num_hidden) File "kcws/train/train_cws_lstm.py", line 62, in init self.c2v = self.load_w2v(c2vPath) File "kcws/train/train_cws_lstm.py", line 132, in load_w2v assert (dim == (FLAGS.embedding_size)) AssertionError

    然后修改了:train_cws_lstm.py 的 tf.app.flags.DEFINE_integer("embedding_size", 50, "embedding size")tf.app.flags.DEFINE_integer("embedding_size", 200, "embedding size")就好

    opened by rockyzhengwu 5
  • 词性标注模型最后一步报错 MemoryError

    词性标注模型最后一步报错 MemoryError

    $ python tools/freeze_graph.py --input_graph pos_logs/graph.pbtxt --input_checkpoint pos_logs/model.ckpt --output_node_names "transitions,Reshape_9" --output_graph kcws/models/pos_model.pbtxt Traceback (most recent call last): File "tools/freeze_graph.py", line 202, in app.run(main=main, argv=[sys.argv[0]] + unparsed) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run _sys.exit(main(_sys.argv[:1] + flags_passthrough)) File "tools/freeze_graph.py", line 134, in main FLAGS.variable_names_blacklist) File "tools/freeze_graph.py", line 93, in freeze_graph text_format.Merge(f.read().decode("utf-8"), input_graph_def) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 525, in Merge descriptor_pool=descriptor_pool) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 579, in MergeLines return parser.MergeLines(lines, message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 612, in MergeLines self._ParseOrMerge(lines, message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 627, in _ParseOrMerge self._MergeField(tokenizer, message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 727, in _MergeField merger(tokenizer, message, field) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 815, in _MergeMessageField self._MergeField(tokenizer, sub_message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 727, in _MergeField merger(tokenizer, message, field) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 815, in _MergeMessageField self._MergeField(tokenizer, sub_message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 727, in _MergeField merger(tokenizer, message, field) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 815, in _MergeMessageField self._MergeField(tokenizer, sub_message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 727, in _MergeField merger(tokenizer, message, field) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 815, in _MergeMessageField self._MergeField(tokenizer, sub_message) File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 714, in _MergeField tokenizer.Consume(':') File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 1078, in Consume if not self.TryConsume(token): File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 1065, in TryConsume self.NextToken() File "/usr/local/lib/python2.7/dist-packages/google/protobuf/text_format.py", line 1314, in NextToken match = self._TOKEN.match(self._current_line, self._column) MemoryError

    opened by kinghuangdd 4
  • 编译后台服务出现新错误。。。

    编译后台服务出现新错误。。。

    您好,执行命令:bazel build //kcws/cc:seg_backend_api 报错如下:

    ERROR: /home/di/pycharmProjects/segment/kcws/third_party/gflags/BUILD:5:1: Reassignment of builtin build function 'package_name' not permitted. ERROR: /home/di/pycharmProjects/segment/kcws/third_party/glog/BUILD:5:1: Reassignment of builtin build function 'package_name' not permitted. ERROR: /home/di/pycharmProjects/segment/kcws/third_party/gflags/BUILD:41:1: Target '//third_party/gflags:empty.cc' contains an error and its package is in error and referenced by '//third_party/gflags:gflags-cxx'. ERROR: /home/di/pycharmProjects/segment/kcws/third_party/gflags/BUILD:41:1: Target '//third_party/gflags:include/gflags/gflags_declare.h' contains an error and its package is in error and referenced by '//third_party/gflags:gflags-cxx'. ERROR: /home/di/pycharmProjects/segment/kcws/third_party/gflags/BUILD:41:1: Target '//third_party/gflags:lib/libgflags.a' contains an error and its package is in error and referenced by '//third_party/gflags:gflags-cxx'. ERROR: /home/di/pycharmProjects/segment/kcws/third_party/gflags/BUILD:41:1: Target '//third_party/gflags:include/gflags/gflags.h' contains an error and its package is in error and referenced by '//third_party/gflags:gflags-cxx'. ERROR: /home/di/pycharmProjects/segment/kcws/base/BUILD:3:1: Target '//third_party/gflags:gflags-cxx' contains an error and its package is in error and referenced by '//base:base'. ERROR: /home/di/pycharmProjects/segment/kcws/base/BUILD:3:1: Target '//third_party/glog:glog-cxx' contains an error and its package is in error and referenced by '//base:base'. ERROR: Analysis of target '//kcws/cc:seg_backend_api' failed; build aborted. INFO: Elapsed time: 0.167s

    执行命令:bazel build third_party/word2vec:word2vec 能成功bazel,其他的命令如:bazel build kcws/train:generate_training,bazel build kcws/cc:dump_vocab均会类似如上错误。在build文件中加了“licenses(["notice"])”依然不行。。。 请问大神这是是什么原因,有空的话能不能帮看一下,不甚感激!

    opened by yufengzhixing 4
  • 编译后台服务报错

    编译后台服务报错

    WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files: /home/cly/github/kcws/tools/bazel.rc INFO: Writing tracer profile to '/home/cly/.cache/bazel/_bazel_cly/271de499a4ab5fb7350261a41335ecd2/command.profile.gz' ERROR: /home/cly/github/kcws/WORKSPACE:5:1: name 'new_http_archive' is not defined ERROR: /home/cly/github/kcws/WORKSPACE:18:1: name 'new_http_archive' is not defined ERROR: /home/cly/github/kcws/WORKSPACE:34:1: name 'http_archive' is not defined ERROR: error loading package '': Encountered error while reading extension file 'tools/build_defs/repo/http.bzl': no such package '@bazel_tools//tools/build_defs/repo': error loading package 'external': Could not load //external package ERROR: error loading package '': Encountered error while reading extension file 'tools/build_defs/repo/http.bzl': no such package '@bazel_tools//tools/build_defs/repo': error loading package 'external': Could not load //external package INFO: Elapsed time: 0.032s INFO: 0 processes. FAILED: Build did NOT complete successfully (0 packages loaded)

    opened by lingyiliu016 2
  • error C++ compilation of rule '@protobuf//:protobuf' failed (Exit 2). cl: 命令行 error D8021 :无效的数值参数“/Wwrite-strings”

    error C++ compilation of rule '@protobuf//:protobuf' failed (Exit 2). cl: 命令行 error D8021 :无效的数值参数“/Wwrite-strings”

    ERROR: C:/users/thomas/appdata/local/temp/_bazel_thomas/infhcau0/external/protob uf/BUILD:113:1: C++ compilation of rule '@protobuf//:protobuf' failed (Exit 2): cl.exe failed: error executing command cd C:/users/thomas/appdata/local/temp/_bazel_thomas/infhcau0/execroot/main

    SET INCLUDE=F:\Tools\Microsoft Visual Studio 14.0\VC\INCLUDE;F:\Tools\Microsof t Visual Studio 14.0\VC\ATLMFC\INCLUDE;C:\Program Files (x86)\Windows Kits\10\in clude\10.0.14393.0\ucrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\inclu de\um;C:\Program Files (x86)\Windows Kits\10\include\10.0.14393.0\shared;C:\Prog ram Files (x86)\Windows Kits\10\include\10.0.14393.0\um;C:\Program Files (x86)\W indows Kits\10\include\10.0.14393.0\winrt; SET LIB=F:\Tools\Microsoft Visual Studio 14.0\VC\LIB\amd64;F:\Tools\Microsof t Visual Studio 14.0\VC\ATLMFC\LIB\amd64;C:\Program Files (x86)\Windows Kits\10
    lib\10.0.14393.0\ucrt\x64;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\lib \um\x64;C:\Program Files (x86)\Windows Kits\10\lib\10.0.14393.0\um\x64; SET PATH=F:\Tools\Microsoft Visual Studio 14.0\Common7\IDE\CommonExtensions
    Microsoft\TestWindow;F:\Tools\Microsoft Visual Studio 14.0\VC\BIN\amd64;C:\WINDO WS\Microsoft.NET\Framework64\v4.0.30319;F:\Tools\Microsoft Visual Studio 14.0\VC \VCPackages;F:\Tools\Microsoft Visual Studio 14.0\Common7\IDE;F:\Tools\Microsoft Visual Studio 14.0\Common7\Tools;F:\Tools\Microsoft Visual Studio 14.0\Team Too ls\Performance Tools\x64;F:\Tools\Microsoft Visual Studio 14.0\Team Tools\Perfor mance Tools;C:\Program Files (x86)\Windows Kits\10\bin\x64;C:\Program Files (x86 )\Windows Kits\10\bin\x86;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\b in\NETFX 4.6.1 Tools\x64;;C:\WINDOWS\system32 SET PWD=/proc/self/cwd SET TEMP=C:\Users\Thomas\AppData\Local\Temp SET TMP=C:\Users\Thomas\AppData\Local\Temp F:/Tools/Microsoft Visual Studio 14.0/VC/bin/amd64/cl.exe /c external/protobuf /src/google/protobuf/struct.pb.cc /Fobazel-out/msvc_x64-fastbuild/bin/external/p rotobuf/objs/protobuf/external/protobuf/src/google/protobuf/struct.pb.o /nologo /DCOMPILER_MSVC /DNOMINMAX /D_WIN32_WINNT=0x0600 /D_CRT_SECURE_NO_DEPRECATE /D CRT_SECURE_NO_WARNINGS /D_SILENCE_STDEXT_HASH_DEPRECATION_WARNINGS /bigobj /Zm50 0 /J /Gy /GF /EHsc /wd4351 /wd4291 /wd4250 /wd4996 /Iexternal/protobuf /Ibazel-o ut/msvc_x64-fastbuild/genfiles/external/protobuf /Iexternal/bazel_tools /Ibazel- out/msvc_x64-fastbuild/genfiles/external/bazel_tools /Iexternal/protobuf/src /Ib azel-out/msvc_x64-fastbuild/genfiles/external/protobuf/src /Iexternal/bazel_tool s/tools/cpp/gcc3 /showIncludes /MT /Od /Z7 -DHAVE_PTHREAD -Wall -Wwrite-strings -Woverloaded-virtual -Wno-sign-compare -Wno-unused-function. cl: 命令行 error D8021 :无效的数值参数“/Wwrite-strings” Target //kcws/cc:seg_backend_api failed to build ____Elapsed time: 2.704s, Critical Path: 0.13s

    opened by thomas1984 2
  • 关于模型导出--output_node_names

    关于模型导出--output_node_names "transitions,Reshape_9" "transitions,Reshape_7" 什么意思

    模型导出时指定 output node 在解码的时候作为模型的输出; 训练的时候不是应该指定这两个名字吗? 我在bilstm.py 文件找到了 Reshape_7 这个output的定义 但没找到pos训练 Reshape_9 这个output的定义 以及transitions的定义, 这两个是tensorflow 默认的output node还是什么? 麻烦解释下,谢谢

    opened by forever1dream 3
Releases(test)
Using python libraries to track hands

Python-HandTracking Using python libraries to track hands on a camera Uses cv2 and mediapipe libraries custom hand tracking module PyCharm IDE Final E

Martin Matsudaira 1 Dec 17, 2021
A version of nrsc5-gui that merges the interface developed by cmnybo with the architecture developed by zefie in order to start a new baseline that is not heavily dependent upon Python processing.

NRSC5-DUI is a graphical interface for nrsc5. It makes it easy to play your favorite FM HD radio stations using an RTL-SDR dongle. It will also displa

61 Dec 22, 2022
code for our ICCV 2021 paper "DeepCAD: A Deep Generative Network for Computer-Aided Design Models"

DeepCAD This repository provides source code for our paper: DeepCAD: A Deep Generative Network for Computer-Aided Design Models Rundi Wu, Chang Xiao,

Rundi Wu 85 Dec 31, 2022
Document Image Dewarping

Document image dewarping using text-lines and line Segments Abstract Conventional text-line based document dewarping methods have problems when handli

Taeho Kil 268 Dec 23, 2022
An advanced 2D image manipulation with features such as edge detection and image segmentation built using OpenCV

OpenCV-ToothPaint3-Advanced-Digital-Image-Editor This application named ‘Tooth Paint’ version TP_2020.3 (64-bit) or version 3 was developed within a w

JunHong 1 Nov 05, 2021
Um simples projeto para fazer o reconhecimento do captcha usado pelo jogo bombcrypto

CaptchaSolver - LEIA ISSO 😓 Para iniciar o codigo: pip install -r requirements.txt python captcha_solver.py Se você deseja pegar ver o resultado das

Kawanderson 50 Mar 21, 2022
Image Smoothing and Blurring Using OpenCV

Image-Smoothing-and-Blurring-Using-OpenCV This repository contains codes for performing image smoothing and blurring using OpenCV. There are different

Happy N. Monday 3 Feb 15, 2022
An Implementation of the FOTS: Fast Oriented Text Spotting with a Unified Network

FOTS: Fast Oriented Text Spotting with a Unified Network Introduction This is a pytorch re-implementation of FOTS: Fast Oriented Text Spotting with a

GeorgeJoe 171 Aug 04, 2022
Scan the MRZ code of a passport and extract the firstname, lastname, passport number, nationality, date of birth, expiration date and personal numer.

PassportScanner Works with 2 and 3 line identity documents. What is this With PassportScanner you can use your camera to scan the MRZ code of a passpo

Edwin Vermeer 441 Dec 24, 2022
Image augmentation for machine learning experiments.

imgaug This python library helps you with augmenting images for your machine learning projects. It converts a set of input images into a new, much lar

Alexander Jung 13.2k Jan 02, 2023
Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

OCRopus 3.2k Dec 31, 2022
A list of hyperspectral image super-solution resources collected by Junjun Jiang

A list of hyperspectral image super-resolution resources collected by Junjun Jiang. If you find that important resources are not included, please feel free to contact me.

Junjun Jiang 301 Jan 05, 2023
基于openpose和图像分类的手语识别项目

手语识别 0、使用到的模型 (1). openpose,作者:CMU-Perceptual-Computing-Lab https://github.com/CMU-Perceptual-Computing-Lab/openpose (2). 图像分类classification,作者:Bubbl

20 Dec 15, 2022
Detect textlines in document images

Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data

QURATOR-SPK 70 Jun 30, 2022
An expandable and scalable OCR pipeline

Overview Nidaba is the central controller for the entire OGL OCR pipeline. It oversees and automates the process of converting raw images into citable

81 Jan 04, 2023
TextField: Learning A Deep Direction Field for Irregular Scene Text Detection (TIP 2019)

TextField: Learning A Deep Direction Field for Irregular Scene Text Detection Introduction The code and trained models of: TextField: Learning A Deep

Yukang Wang 101 Dec 12, 2022
Perspective recovery of text using transformed ellipses

unproject_text Perspective recovery of text using transformed ellipses. See full writeup at https://mzucker.github.io/2016/10/11/unprojecting-text-wit

Matt Zucker 111 Nov 13, 2022
Repository of conference publications and source code for first-/ second-authored papers published at NeurIPS, ICML, and ICLR.

Repository of conference publications and source code for first-/ second-authored papers published at NeurIPS, ICML, and ICLR.

Daniel Jarrett 26 Jun 17, 2021
Links to awesome OCR projects

Awesome OCR This list contains links to great software tools and libraries and literature related to Optical Character Recognition (OCR). Contribution

Konstantin Baierer 2.2k Jan 02, 2023
SRA's seminar on Introduction to Computer Vision Fundamentals

Introduction to Computer Vision This repository includes basics to : Python Numpy: A python library Git Computer Vision. The aim of this repository is

Society of Robotics and Automation 147 Dec 04, 2022