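lxml is a Python library that parses HTML/XML and builds a DOM-like tree. It is feature-rich and performs well. lxml provides an ElementTree-compatible API and can also work with html5lib and BeautifulSoup. A note before using lxml: first make sure the HTML has been decoded from UTF-8, i.e. code = html.decode('utf-8', 'ignore'); otherwise parsing errors can occur. This is because Chinese text, once encoded as UTF-8, becomes '/u...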
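The decoding step described above can be sketched as follows. This is a minimal illustration, not the author's full code: the sample markup and the variable names `raw`, `tree`, and `text` are assumptions made for the example, and `raw` stands in for bytes that would normally come from an HTTP response or a file.

```python
import lxml.html

# Hypothetical input: UTF-8 encoded bytes, as returned by a network read.
raw = "<html><body><p>中文内容</p></body></html>".encode("utf-8")

# Decode to str first; 'ignore' silently drops any bytes that are not valid UTF-8.
code = raw.decode("utf-8", "ignore")

# Parse the decoded string and build the element tree.
tree = lxml.html.fromstring(code)

# Pull the paragraph text back out with XPath.
text = tree.xpath("//p/text()")[0]
print(text)
```

Note that `'ignore'` discards undecodable bytes without raising an error, so malformed input can lose characters silently; whether that trade-off is acceptable depends on the data being parsed.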