Published: 2022/12/27 9:45:24 Author: Administrator Source: this site
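1. Getting AngleSharp via NuGet
1) Using the NuGet Package Manager Console
The simplest way to integrate AngleSharp into a project is through NuGet. Open the Package Manager Console (PM) and enter the following command to install AngleSharp:
Install-Package AngleSharp
2) Using the NuGet graphical package manager
In the NuGet package manager UI, search for "AngleSharp", select it in the results, and click "Install".
Related article: Using NuGet in VS (Visual Studio)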
2. Example: parsing HTML with AngleSharp
using System;
using AngleSharp;

var source = @"
<!DOCTYPE html>
<html lang=en>
<meta charset=utf-8>
<meta name=viewport content=""initial-scale=1, minimum-scale=1, width=device-width"">
<title>Error 404 (Not Found)!!1</title>
<style>
*{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/errors/logo_sm_2.png) no-repeat}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/errors/logo_sm_2_hr.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/errors/logo_sm_2_hr.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/errors/logo_sm_2_hr.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:55px;width:150px}
</style>
<a href=//www.google.com/><span id=logo aria-label=Google></span></a>
<p><b>404.</b> <ins>That’s an error.</ins>
<p>The requested URL <code>/error</code> was not found on this server. <ins>That’s all we know.</ins>";
// Use the default configuration for AngleSharp
var config = Configuration.Default;
// Create a new context for evaluating web pages with the given configuration
var context = BrowsingContext.New(config);
// Just get the DOM representation
var document = await context.OpenAsync(req => req.Content(source));
// Serialize it back to the console
Console.WriteLine(document.DocumentElement.OuterHtml);
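When no request/response handling is needed, the same kind of markup can also be parsed without a browsing context. The following is a minimal sketch, assuming AngleSharp 0.10 or later, where HtmlParser lives in the AngleSharp.Html.Parser namespace. The class name ParserSketch and the method GetTitleAndParagraphs are hypothetical names chosen for illustration, not part of the library.

```csharp
using System.Collections.Generic;
using System.Linq;
using AngleSharp.Html.Parser;

public static class ParserSketch
{
    // Hypothetical helper: parses a string synchronously and returns the
    // document title plus the text content of every <p> element.
    public static (string Title, List<string> Paragraphs) GetTitleAndParagraphs(string html)
    {
        // HtmlParser can be used on its own, without a BrowsingContext
        // (assumption: AngleSharp 0.10+ namespace layout).
        var parser = new HtmlParser();
        var document = parser.ParseDocument(html);

        // QuerySelectorAll returns an enumerable collection, so LINQ applies.
        var paragraphs = document.QuerySelectorAll("p")
            .Select(p => p.TextContent)
            .ToList();

        return (document.Title, paragraphs);
    }
}
```

Calling ParserSketch.GetTitleAndParagraphs(source) with the 404 page above should return its title and the text of its two paragraphs. The browsing context remains the better fit once documents are loaded from a URL or scripts and styles matter.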
3. Simple operations on the document (DOM)
// Requires: using System; using System.Threading.Tasks; using AngleSharp;
static async Task FirstExample()
{
    // Use the default configuration for AngleSharp
    var config = Configuration.Default;
    // Create a new context for evaluating web pages with the given configuration
    var context = BrowsingContext.New(config);
    // Parse the document from the content of a response to a virtual request
    var document = await context.OpenAsync(req => req.Content("<h1>Some example source</h1><p>This is a paragraph element"));
    // Do something with the document, for example:
    Console.WriteLine("Serializing the (original) document:");
    Console.WriteLine(document.DocumentElement.OuterHtml);
    var p = document.CreateElement("p");
    p.TextContent = "This is another paragraph.";
    Console.WriteLine("Inserting another element in the body ...");
    document.Body.AppendChild(p);
    Console.WriteLine("Serializing the document again:");
    Console.WriteLine(document.DocumentElement.OuterHtml);
}
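Beyond appending nodes, the same DOM interfaces allow removing elements and editing attributes. The sketch below is one possible illustration under assumptions: the class DomEditSketch, the method StripHeadingAndTagParagraphs, the class name "note", and the data-source attribute are all hypothetical and exist only for this example.

```csharp
using System.Threading.Tasks;
using AngleSharp;

public static class DomEditSketch
{
    // Hypothetical helper: removes the first <h1>, marks every <p> with a CSS
    // class and an example attribute, and returns the serialized body content.
    public static async Task<string> StripHeadingAndTagParagraphs(string html)
    {
        var context = BrowsingContext.New(Configuration.Default);
        var document = await context.OpenAsync(req => req.Content(html));

        // QuerySelector returns null when nothing matches, hence the "?." guard.
        document.QuerySelector("h1")?.Remove();

        foreach (var p in document.QuerySelectorAll("p"))
        {
            p.ClassList.Add("note");                 // adds the class "note"
            p.SetAttribute("data-source", "sketch"); // arbitrary example attribute
        }

        return document.Body.InnerHtml;
    }
}
```

Applied to the markup used in FirstExample, the returned string should no longer contain the h1 element, and the paragraph should carry the added class and attribute.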
4. Getting elements from the HTML
// Requires: using System; using System.Linq; using System.Threading.Tasks;
//           using AngleSharp; using AngleSharp.Dom; (AngleSharp.Dom provides the Text() extension)
static async Task UsingLinq()
{
    // Create a new context for evaluating web pages with the default configuration
    var context = BrowsingContext.New(Configuration.Default);
    // Create a document from a virtual request / response pattern
    var document = await context.OpenAsync(req => req.Content("<ul><li>First item<li>Second item<li class='blue'>Third item!<li class='blue red'>Last item!</ul>"));
    // Do something with LINQ
    var blueListItemsLinq = document.All.Where(m => m.LocalName == "li" && m.ClassList.Contains("blue"));
    // Or use CSS selectors directly
    var blueListItemsCssSelector = document.QuerySelectorAll("li.blue");
    Console.WriteLine("Comparing both ways ...");
    Console.WriteLine();
    Console.WriteLine("LINQ:");
    foreach (var item in blueListItemsLinq)
    {
        Console.WriteLine(item.Text());
    }
    Console.WriteLine();
    Console.WriteLine("CSS:");
    foreach (var item in blueListItemsCssSelector)
    {
        Console.WriteLine(item.Text());
    }
}
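A common follow-up to selecting elements is reading their attributes. As a hedged sketch, the helper below combines an attribute selector with LINQ to collect the href of each link; the names LinkSketch and GetHrefs are hypothetical, and the values are returned exactly as written in the markup (no URL resolution is attempted).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using AngleSharp;

public static class LinkSketch
{
    // Hypothetical helper: returns the raw href value of every <a> element
    // that actually has an href attribute.
    public static async Task<List<string>> GetHrefs(string html)
    {
        var context = BrowsingContext.New(Configuration.Default);
        var document = await context.OpenAsync(req => req.Content(html));

        // "a[href]" skips anchors without an href, so GetAttribute
        // does not return null for the selected elements.
        return document.QuerySelectorAll("a[href]")
            .Select(a => a.GetAttribute("href"))
            .ToList();
    }
}
```

Filtering in the selector rather than in LINQ keeps the null check out of the projection; either approach selects the same elements, as the comparison in UsingLinq suggests.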
// Requires: using System; using System.Threading.Tasks; using AngleSharp; using AngleSharp.Dom;
static async Task SingleElements()
{
    // Create a new context for evaluating web pages with the default configuration
    var context = BrowsingContext.New(Configuration.Default);
    // Create a new document
    var document = await context.OpenAsync(req => req.Content("<b><i>This is some <em> bold <u>and</u> italic </em> text!</i></b>"));
    var emphasize = document.QuerySelector("em");
    Console.WriteLine("Difference between several ways of getting text:");
    Console.WriteLine();
    Console.WriteLine("Only from C# / AngleSharp:");
    Console.WriteLine();
    Console.WriteLine(emphasize.ToHtml()); //<em> bold <u>and</u> italic </em>
    Console.WriteLine(emphasize.Text()); // bold and italic
    Console.WriteLine();
    Console.WriteLine("From the DOM:");
    Console.WriteLine();
    Console.WriteLine(emphasize.InnerHtml); // bold <u>and</u> italic
    Console.WriteLine(emphasize.OuterHtml); //<em> bold <u>and</u> italic </em>
    Console.WriteLine(emphasize.TextContent);// bold and italic
}
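SingleElements assumes that the selector matches. QuerySelector returns null when no element is found, so real-world markup usually needs a guard. The following sketch shows one way to handle that case; SafeQuerySketch, FirstTextOrDefault, and the fallback parameter are hypothetical names introduced for this example.

```csharp
using System.Threading.Tasks;
using AngleSharp;

public static class SafeQuerySketch
{
    // Hypothetical helper: returns the trimmed text of the first element
    // matching the selector, or the fallback when nothing matches.
    public static async Task<string> FirstTextOrDefault(string html, string selector, string fallback)
    {
        var context = BrowsingContext.New(Configuration.Default);
        var document = await context.OpenAsync(req => req.Content(html));

        // QuerySelector yields null for "no match"; "?." and "??" turn
        // that into the caller-supplied fallback instead of an exception.
        var element = document.QuerySelector(selector);
        return element?.TextContent.Trim() ?? fallback;
    }
}
```

With the markup from SingleElements, the selector "em" should yield the emphasized text, while a selector such as "h1" should yield the fallback.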
Reference: https://anglesharp.github.io/docs/Examples.html