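PaddleSeg: machine-learning-based image segmentation
Installing PaddleSeg in PyCharm
On the PyCharm welcome screen, use Get from VCS to clone the PaddlePaddle/PaddleSeg project from GitHub.
After the clone succeeds, PyCharm offers to create a venv virtual environment; click OK. When you open the PyCharm terminal, a (venv) prefix on the prompt means the environment was created. If the prefix is missing, close the terminal, click Add new Interpreter to add one, then reopen the terminal. For example:
(venv) ➜ PaddleSeg git:(release/2.9)
Install paddle by running this in the PyCharm Terminal: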
pip install paddlepaddle
Upgrade pip and install setuptools by running this in the PyCharm Terminal:
pip install -U pip setuptools
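(Optional) To check that paddle installed correctly, run the following in the PyCharm Python console:
import paddle
paddle.utils.run_check()
To check the version, run:
print(paddle.__version__)
Install paddleseg by running this in the PyCharm Terminal:
pip install paddleseg
(This installs the released package directly; I could not get a local build to work.) To verify the installation, run this in the PyCharm Terminal: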
sh tests/install/check_predict.sh
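Before moving on, it can help to confirm that the interpreter really is the project venv and that the packages installed above are visible to it. The following is a minimal sketch using only the standard library; the helper names in_virtualenv and missing_modules are made up for this example and are not part of PaddleSeg.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("inside venv:", in_virtualenv())
    print("missing:", missing_modules(["paddle", "paddleseg", "setuptools"]))
```

Run it from the PyCharm Terminal or Python console; an empty "missing" list suggests the three installs above landed in the active environment.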
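Using PaddleSeg
Preparing a custom dataset
Annotation tools (labeling the data)
PaddleSeg supports two annotation tools: LabelMe and the Jingling (精灵) data annotation tool.
Download and install labelme.
Follow its documentation to annotate the data. The basic flow is:
open the image directory -> create a polygon -> outline the object to annotate -> save
Saving generates a JSON file in the image directory. Annotate every image in the directory.
# Directory structure before annotation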
paddle
|--image1.jpg
|--image2.jpg
|--...
# Directory structure after annotation
paddle
|--image1.jpg
|--image2.jpg
|--...
|--image1.json
|--image2.json
|--...
Convert the annotated data into the format required for model training:
# python tools/data/labelme2seg.py [-h] <image directory> <output directory>
python tools/data/labelme2seg.py /Users/x/Downloads/paddle /Users/xuanleung/Downloads/paddle_pr
# Directory structure after conversion
paddle
paddle_pr
|--annotations #images with a red background and green annotations
| |--image1.png
| |--image2.png
| |--...
|--images
| |--image1.jpg
| |--image2.jpg
| |--...
|--class_names.txt #names of all annotated classes in the dataset
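The class names come from the labels typed in while annotating. LabelMe stores each polygon as an entry in the "shapes" list of the JSON file, with the class in its "label" field. As a rough cross-check against class_names.txt, the sketch below collects the labels from all JSON files in a directory. The function name collect_labels is made up for this example, and the converter may add a background entry of its own that will not appear here.

```python
import json
from pathlib import Path


def collect_labels(annotation_dir):
    """Return the sorted set of shape labels found in LabelMe JSON files."""
    labels = set()
    for json_path in sorted(Path(annotation_dir).glob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            data = json.load(f)
        # Each annotated polygon is one entry in "shapes" with a "label".
        for shape in data.get("shapes", []):
            labels.add(shape["label"])
    return sorted(labels)
```

For the layout above, the call would be collect_labels("/Users/x/Downloads/paddle").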
Splitting the data
All original images and label images must be divided into training, validation, and test sets by ratio.
# python tools/data/split_dataset_list.py <dataset_root: dataset root directory> <images_dir_name: original image directory name> <labels_dir_name: label image directory name> ${FLAGS}
python tools/data/split_dataset_list.py /Users/x/Downloads/paddle_pr images annotations --split 0.6 0.2 0.2 --format jpg png
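# Directory structure after running the command
paddle_pr
|--test.txt #test set
|--val.txt #validation set
|--train.txt #training set
|--annotations #images with a red background and green annotations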
| |--image1.png
| |--image2.png
| |--...
|--images
| |--image1.jpg
| |--image2.jpg
| |--...
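|--class_names.txt #names of all annotated classes in the dataset
# Format of the txt files
images/image1.jpg annotations/image1.png
images/image2.jpg annotations/image2.png
....
FLAGS:
FLAG | Meaning | Default | Number of arguments
--split | Split ratios for the training, validation, and test sets | 0.7 0.3 0 | 3
--separator | Separator used in the txt file lists | " " | 1
--format | Image suffixes of the original images and the label images | "jpg" "png" | 2
--postfix | Filter images and labels by whether the file stem (without extension) contains the given postfix | "" "" (two empty strings) | 2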
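To make the --split ratios concrete, here is a small sketch of how a list of image/label pairs could be cut into three parts and formatted like the txt files above. It is an illustration only, not the implementation of split_dataset_list.py; the function names, the fixed seed, and the rounding rule are assumptions made for the example.

```python
import random


def split_pairs(pairs, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle (image, label) pairs and cut them into train/val/test lists."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("split ratios must sum to 1")
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        # Whatever is left after train and val goes to the test set.
        "test": shuffled[n_train + n_val:],
    }


def to_list_lines(pairs, separator=" "):
    """Format pairs as the lines stored in train.txt, val.txt, and test.txt."""
    return [f"{image}{separator}{label}" for image, label in pairs]
```

With 10 annotated images and --split 0.6 0.2 0.2, this yields 6 training, 2 validation, and 2 test entries.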
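Preparing the configuration file
Copy the paddle_pr dataset directory into the project root. Then, in configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml under the project, modify these two path settings:
train_dataset:
Notes on the configuration file:
batch_size: 4 # Number of images fed to the network per iteration. Generally, the more GPU memory available, the larger batch_size can be. With multi-GPU training, the total batch size equals this batch_size multiplied by the number of GPUs.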
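A wrong path in the configuration usually only surfaces once training starts, so it may be worth checking the list files first. The sketch below assumes, as the txt format shown earlier suggests, that every path in a list file is relative to the dataset root; find_broken_entries is a name made up for this example.

```python
from pathlib import Path


def find_broken_entries(dataset_root, list_file, separator=" "):
    """Return (line_number, path) for every listed file missing on disk."""
    root = Path(dataset_root)
    broken = []
    with open(root / list_file, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Each line holds an image path and a label path.
            for relative_path in line.split(separator):
                if not (root / relative_path).is_file():
                    broken.append((line_number, relative_path))
    return broken
```

For the dataset copied into the project root, find_broken_entries("paddle_pr", "train.txt") should return an empty list when everything is in place.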
Training the model
In PyCharm, add a Python run configuration. Set Script to tools/train.py and Script parameters to:
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --save_interval 500 --do_eval --use_vdl --save_dir output
Then click Run in the top-right corner.
The run log begins as follows:
/Users/x/workspace/PaddleSeg/venv/bin/python -X pycache_prefix=/Users/x/Library/Caches/JetBrains/PyCharm2023.3/cpython-cache /Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py --multiprocess --qt-support=auto --client 127.0.0.1 --port 64923 --file /Users/x/workspace/PaddleSeg/tools/train.py --config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --save_interval 500 --do_eval --use_vdl --save_dir output
Generated files:
output
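The run configuration above is equivalent to calling tools/train.py with the same arguments. As a sketch, the call can also be assembled in Python with subprocess, using only the flags shown in this section. The function names are made up for this example, and it assumes the working directory is the PaddleSeg project root and the interpreter is the project venv.

```python
import subprocess
import sys


def build_train_command(config, save_dir="output", save_interval=500):
    """Build the argv list for tools/train.py with the options used above."""
    return [
        sys.executable, "tools/train.py",
        "--config", config,
        "--save_interval", str(save_interval),
        "--do_eval",
        "--use_vdl",
        "--save_dir", save_dir,
    ]


def run_training(config):
    """Run training from the project root; raises CalledProcessError on failure."""
    subprocess.run(build_train_command(config), check=True)
```

Passing the arguments as a list avoids shell quoting issues when paths contain spaces.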
Evaluating the model
Copy the paddle_pr dataset directory into the project root. Then, in configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml under the project, modify these two path settings:
val_dataset:
In PyCharm, add a Python run configuration. Set Script to tools/val.py and Script parameters to:
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --model_path output/iter_1000/model.pdparams
Then click Run in the top-right corner.
Output:
/Users/x/workspace/PaddleSeg/venv/bin/python tools/val.py --config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --model_path output/iter_1000/model.pdparams
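The --model_path above points at output/iter_1000/model.pdparams. Assuming the save directory contains one iter_N folder per saved checkpoint (an assumption based on that path, not on a listing of the output directory), a small helper can pick the newest one instead of hard-coding the iteration number. The name latest_checkpoint is made up for this example.

```python
import re
from pathlib import Path


def latest_checkpoint(save_dir="output"):
    """Return the model.pdparams path under the highest iter_N directory."""
    best_iter, best_path = -1, None
    for child in Path(save_dir).glob("iter_*"):
        match = re.fullmatch(r"iter_(\d+)", child.name)
        params = child / "model.pdparams"
        # Ignore folders that are not iter_<number> or lack a weights file.
        if match and params.is_file() and int(match.group(1)) > best_iter:
            best_iter, best_path = int(match.group(1)), params
    return best_path
```

It returns None when no checkpoint has been written yet.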
Predicting with the model
In PyCharm, add a Python run configuration. Set Script to tools/predict.py and Script parameters to:
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --model_path output/iter_1000/model.pdparams --image_path paddle_pr/images/image1.jpg --save_dir output/result
Then click Run in the top-right corner.
Output:
/Users/x/workspace/PaddleSeg/venv/bin/python tools/predict.py --config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --model_path output/iter_1000/model.pdparams --image_path paddle_pr/images/image1.jpg --save_dir output/result
Generated files:
output/result
Exporting the model
In PyCharm, add a Python run configuration. Set Script to tools/export.py and Script parameters to:
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --model_path output/best_model/model.pdparams --save_dir output/inference_model
Then click Run in the top-right corner.
Output:
/Users/x/workspace/PaddleSeg/venv/bin/python tools/export.py --config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --model_path output/best_model/model.pdparams --save_dir output/inference_model
Generated files:
output/inference_model
Deploying the model (Paddle Inference, Python deployment)
In PyCharm, add a Python run configuration. Set Script to deploy/python/infer.py and Script parameters to:
--config output/inference_model/deploy.yaml --image_path /Users/xuanleung/Downloads/test.jpg --device cpu
Then click Run in the top-right corner.
Generated file: output/test.png
Installing on macOS
# Check the environment
Installing with Docker on macOS
mkdir paddle
Common problems
Installation
Running
python3 -m pip install paddlepaddle==2.6.0 -i https://mirror.baidu.com/pypi/simple
fails with the following error:
➜ ~ python3 -m pip install paddlepaddle==2.6.0 -i https://mirror.baidu.com/pypi/simple
error: externally-managed-environment
× This environment is externally managed
╰─> To install Python packages system-wide, try brew install
xyz, where xyz is the package you are trying to
install.
If you wish to install a non-brew-packaged Python package,
create a virtual environment using python3 -m venv path/to/venv.
Then use path/to/venv/bin/python and path/to/venv/bin/pip.
If you wish to install a non-brew packaged Python application,
it may be easiest to use pipx install xyz, which will manage a
virtual environment for you. Make sure you have pipx installed.
note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
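hint: See PEP 668 for the detailed specification.
Solution (see the article "pip(3) install,完美解决 externally-managed-environment"):
Option 1: add the --break-system-packages flag. This installs directly into the system Python and may affect the system environment.
python3 -m pip install paddlepaddle==2.6.0 -i https://mirror.baidu.com/pypi/simple --break-system-packages
Option 2: pipx. After installation, paddle still could not be imported; my initial guess is that the interpreter was not running inside the virtual environment that pipx created.
# pipx creates an isolated virtual environment for each installed application, which avoids dependency conflicts between applications.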
➜ ~ brew install pipx
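# After installing pipx for the first time, pipx ensurepath checks your environment variables and modifies them if needed, so that programs installed by pipx can be run easily. This step usually needs to be done only once.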
➜ ~ pipx ensurepath
# Replace the official install command with pipx
# Version 2.9.0 cannot be found
➜ ~ pipx install paddlepaddle==2.9.0 -i https://mirror.baidu.com/pypi/simple
Fatal error from pip prevented installation. Full pip output in file:
/Users/xuanleung/Library/Logs/pipx/cmd_2024-03-25_15.41.49_pip_errors.log
Some possibly relevant errors from pip install:
ERROR: Could not find a version that satisfies the requirement paddlepaddle==2.9.0 (from versions: 2.6.0)
ERROR: No matching distribution found for paddlepaddle==2.9.0
Error installing paddlepaddle from spec 'paddlepaddle==2.9.0'.
# Install 2.6.0 instead
➜ ~ pipx install paddlepaddle==2.6.0 -i https://mirror.baidu.com/pypi/simple
installed package paddlepaddle 2.6.0, installed using Python 3.12.2
These apps are now globally available
- fleetrun
- paddle
⚠️ Note: '/Users/xuanleung/.local/bin' is not on your PATH environment
variable. These apps will not be globally accessible until your PATH is
updated. Run `pipx ensurepath` to automatically add it, or manually modify
your PATH in your shell's config file (i.e. ~/.bashrc).
done! ✨ 🌟 ✨
# Run it again to update the environment variables
➜ ~ pipx ensurepath
Option 3 (adopted): use venv
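For the venv route, the standard library can both detect the PEP 668 situation and create the environment. PEP 668 marks a managed installation with a file named EXTERNALLY-MANAGED in the stdlib directory, which is what the first helper looks for. This is a minimal sketch; the function names and the env_dir argument are made up for this example.

```python
import os
import sysconfig
import venv
from pathlib import Path


def is_externally_managed():
    """Return True if the base interpreter carries the PEP 668 marker file."""
    marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
    return marker.is_file()


def venv_python(env_dir):
    """Return the path of the Python executable inside a venv."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_env(env_dir, with_pip=True):
    """Create a virtual environment, like `python3 -m venv env_dir`."""
    venv.create(env_dir, with_pip=with_pip)
    return venv_python(env_dir)
```

The returned interpreter can then run the install command from above, for example with subprocess.run([str(python), "-m", "pip", "install", "paddlepaddle==2.6.0"], check=True), without touching the system Python.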
Running
import paddle
fails with Python 3: ImportError "No Module named Setuptools":
>>> import paddle
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/xx/workspace/paddleseg_python3/lib/python3.12/site-packages/paddle/__init__.py", line 28, in <module>
from .base import core # noqa: F401
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/xx/workspace/paddleseg_python3/lib/python3.12/site-packages/paddle/base/__init__.py", line 77, in <module>
from . import dataset
File "/Users/xx/workspace/paddleseg_python3/lib/python3.12/site-packages/paddle/base/dataset.py", line 20, in <module>
from ..utils import deprecated
File "/Users/xx/workspace/paddleseg_python3/lib/python3.12/site-packages/paddle/utils/__init__.py", line 16, in <module>
from . import ( # noqa: F401
File "/Users/xx/workspace/paddleseg_python3/lib/python3.12/site-packages/paddle/utils/cpp_extension/__init__.py", line 15, in <module>
from .cpp_extension import (
File "/Users/xx/workspace/paddleseg_python3/lib/python3.12/site-packages/paddle/utils/cpp_extension/cpp_extension.py", line 21, in <module>
import setuptools
ModuleNotFoundError: No module named 'setuptools'
>>> paddle.utils.run_check()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
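NameError: name 'paddle' is not defined
Solution: run the command below. Virtual environments created with Python 3.12 no longer preinstall setuptools, so it has to be installed explicitly.
pip install -U pip setuptools
Running
python tools/train.py \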
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--do_eval \
--use_vdl \
--save_interval 500 \
--save_dir output
fails with the following error:
(venv) ➜ PaddleSeg git:(release/2.9) ✗ python tools/train.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--save_interval 500 \
--do_eval \
--use_vdl \
--save_dir output
Traceback (most recent call last):
File "/Users/x/workspace/PaddleSeg/tools/train.py", line 213, in <module>
main(args)
File "/Users/x/workspace/PaddleSeg/tools/train.py", line 145, in main
cfg = Config(
^^^^^^^
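TypeError: Config.__init__() got an unexpected keyword argument 'to_static_training'
Cause: tools/train.py on the release/2.9 branch passes a to_static_training argument that the Config class of the installed paddleseg package does not accept, so the repository scripts and the pip-installed package appear to be out of sync.
Solution: switch to the release/2.8.1 branch and run the command again.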
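A mismatch like this can be confirmed before running a long job by inspecting the constructor's signature. The sketch below uses inspect.signature from the standard library; OldConfig and NewConfig are hypothetical stand-ins written for this example, not PaddleSeg's actual Config class.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    for param in inspect.signature(func).parameters.values():
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True  # **kwargs accepts any keyword
        if param.name == name and param.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        ):
            return True
    return False


class OldConfig:
    """Hypothetical stand-in for a Config class without the new option."""

    def __init__(self, path, learning_rate=None):
        self.path = path


class NewConfig:
    """Hypothetical stand-in for a Config class that has the new option."""

    def __init__(self, path, learning_rate=None, to_static_training=False):
        self.path = path
```

In the project venv, the same check could be applied to the installed class, for example accepts_keyword(Config, "to_static_training") after importing Config from the paddleseg package; a False result would point to the version mismatch described above.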
After switching branches, running
python tools/train.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--do_eval \
--use_vdl \
--save_interval 500 \
--save_dir output
fails with the following error:
- type: CrossEntropyLoss
- type: CrossEntropyLoss
- type: CrossEntropyLoss
/Users/xuanleung/workspace/PaddleSeg/venv/lib/python3.12/site-packages/paddle/nn/layer/norm.py:824: UserWarning: When training, we now always track global mean and variance.
warnings.warn(
Traceback (most recent call last):
File "/Users/x/workspace/PaddleSeg/tools/train.py", line 195, in <module>
main(args)
File "/Users/x/workspace/PaddleSeg/tools/train.py", line 170, in main
train(
File "/Users/x/workspace/PaddleSeg/venv/lib/python3.12/site-packages/paddleseg/core/train.py", line 273, in train
avg_loss_list = [l[0] / log_iters for l in avg_loss_list]
~^^^
IndexError: too many indices for array: array is 0-dimensional, but 1 were indexed
Solution: in PyCharm, add a Python run configuration. Set Script to tools/train.py and Script parameters to:
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --save_interval 500 --do_eval --use_vdl --save_dir output
Then click Run in the top-right corner.