
How to Replace or Modify Hyperlinks with Aspose.Words for .NET


In the current version, Aspose.Words has no built-in functionality for working with hyperlink fields.

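Hyperlinks in Microsoft Word documents are fields. A field consists of a field code and a field result. In the current version of Aspose.Words, no single object represents a field.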

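The following example shows how to create a simple class that represents a hyperlink in a document. Its constructor accepts a FieldStart object, which must be of type FieldType.FieldHyperlink. With the Hyperlink class you can get or set the target address (Target), the display name (Name), and the IsLocal property, which makes it easy to change hyperlink targets and names throughout a document. In this example, every hyperlink that is not local is changed to point to "http://www.aspose.com".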

Example

Find all hyperlinks in a Word document and change their URL and display name.

 

using System;
using System.Text;
using System.Text.RegularExpressions;
using Aspose.Words;
using Aspose.Words.Fields;

namespace Examples
{
/// <summary>
/// Shows how to replace hyperlinks in a Word document.
/// </summary>
public class ExReplaceHyperlinks : ExBase
{
/// <summary>
/// Finds all hyperlinks in a Word document and changes their URL and display name.
/// </summary>
public void ReplaceHyperlinks()
{
// Specify your document name here.
Document doc = new Document(MyDir + "ReplaceHyperlinks.doc");

// Hyperlinks in Word documents are fields, so select all field start nodes to find the hyperlinks.
NodeList fieldStarts = doc.SelectNodes("//FieldStart");
foreach (FieldStart fieldStart in fieldStarts)
{
if (fieldStart.FieldType.Equals(FieldType.FieldHyperlink))
{
// The field is a hyperlink field, use the "facade" class to help to deal with the field.
Hyperlink hyperlink = new Hyperlink(fieldStart);

// Some hyperlinks can be local (links to bookmarks inside the document), ignore these.
if (hyperlink.IsLocal)
continue;

// The Hyperlink class makes it easy to set the target URL and the display name
// of the link through its properties.
hyperlink.Target = NewUrl;
hyperlink.Name = NewName;
}
}

doc.Save(MyDir + "ReplaceHyperlinks Out.doc");
}

private const string NewUrl = @"http://www.aspose.com";
private const string NewName = "Aspose - The .NET & Java Component Publisher";
}

/// <summary>
/// This "facade" class makes it easier to work with a hyperlink field in a Word document.
///
/// A hyperlink is represented by a HYPERLINK field in a Word document. A field in Aspose.Words
/// consists of several nodes and it might be difficult to work with all those nodes directly.
/// Note this is a simple implementation and will work only if the hyperlink code and name
/// each consist of one Run only.
///
/// [FieldStart][Run - field code][FieldSeparator][Run - field result][FieldEnd]
///
/// The field code contains a string in one of these formats:
/// HYPERLINK "url"
/// HYPERLINK \l "bookmark name"
///
/// The field result contains text that is displayed to the user.
/// </summary>
internal class Hyperlink
{
internal Hyperlink(FieldStart fieldStart)
{
if (fieldStart == null)
throw new ArgumentNullException("fieldStart");
if (!fieldStart.FieldType.Equals(FieldType.FieldHyperlink))
throw new ArgumentException("Field start type must be FieldHyperlink.");

mFieldStart = fieldStart;

// Find the field separator node.
mFieldSeparator = fieldStart.GetField().Separator;
if (mFieldSeparator == null)
throw new InvalidOperationException("Cannot find field separator.");

mFieldEnd = fieldStart.GetField().End;

// The field code looks something like [ HYPERLINK "http://www.myurl.com" ], but it can consist of several runs.
string fieldCode = fieldStart.GetField().GetFieldCode();
Match match = gRegex.Match(fieldCode.Trim());
mIsLocal = (match.Groups[1].Length > 0); //The link is local if \l is present in the field code.
mTarget = match.Groups[2].Value;
}

/// <summary>
/// Gets or sets the display name of the hyperlink.
/// </summary>
internal string Name
{
get
{
return GetTextSameParent(mFieldSeparator, mFieldEnd);
}
set
{
// Hyperlink display name is stored in the field result which is a Run
// node between field separator and field end.
Run fieldResult = (Run)mFieldSeparator.NextSibling;
fieldResult.Text = value;

// But sometimes the field result can consist of more than one run, delete these runs.
RemoveSameParent(fieldResult.NextSibling, mFieldEnd);
}
}

/// <summary>
/// Gets or sets the target url or bookmark name of the hyperlink.
/// </summary>
internal string Target
{
get
{
return mTarget;
}
set
{
mTarget = value;
UpdateFieldCode();
}
}

/// <summary>
/// True if the hyperlink's target is a bookmark inside the document. False if the hyperlink is a url.
/// </summary>
internal bool IsLocal
{
get
{
return mIsLocal;
}
set
{
mIsLocal = value;
UpdateFieldCode();
}
}

private void UpdateFieldCode()
{
// Field code is stored in a Run node between field start and field separator.
Run fieldCode = (Run)mFieldStart.NextSibling;
fieldCode.Text = string.Format("HYPERLINK {0}\"{1}\"", ((mIsLocal) ? "\\l " : ""), mTarget);

// But sometimes the field code can consist of more than one run, delete these runs.
RemoveSameParent(fieldCode.NextSibling, mFieldSeparator);
}

/// <summary>
/// Retrieves text from start up to but not including the end node.
/// </summary>
private static string GetTextSameParent(Node startNode, Node endNode)
{
if ((endNode != null) && (startNode.ParentNode != endNode.ParentNode))
throw new ArgumentException("Start and end nodes are expected to have the same parent.");

StringBuilder builder = new StringBuilder();
for (Node child = startNode; !child.Equals(endNode); child = child.NextSibling)
builder.Append(child.GetText());

return builder.ToString();
}

/// <summary>
/// Removes nodes from start up to but not including the end node.
/// Start and end are assumed to have the same parent.
/// </summary>
private static void RemoveSameParent(Node startNode, Node endNode)
{
if ((endNode != null) && (startNode.ParentNode != endNode.ParentNode))
throw new ArgumentException("Start and end nodes are expected to have the same parent.");

Node curChild = startNode;
while ((curChild != null) && (curChild != endNode))
{
Node nextChild = curChild.NextSibling;
curChild.Remove();
curChild = nextChild;
}
}

private readonly Node mFieldStart;
private readonly Node mFieldSeparator;
private readonly Node mFieldEnd;
private bool mIsLocal;
private string mTarget;

/// <summary>
/// Matches a hyperlink field code and captures the optional \l switch (group 1)
/// and the quoted target (group 2).
/// </summary>
private static readonly Regex gRegex = new Regex(
"\\S+" + // one or more non-space characters: HYPERLINK, or the equivalent word in other languages
"\\s+" + // one or more spaces
"(?:\"\"\\s+)?" + // non-capturing optional "" followed by one or more spaces, found in one customer's files
"(\\\\l\\s+)?" + // optional \l flag followed by one or more spaces
"\"" + // one opening quotation mark
"([^\"]+)" + // one or more characters other than a quotation mark (the hyperlink target)
"\"" // one closing quotation mark
);
}
}

 

Imports Microsoft.VisualBasic
Imports System
Imports System.Text
Imports System.Text.RegularExpressions
Imports Aspose.Words
Imports Aspose.Words.Fields

Namespace Examples
''' <summary>
''' Shows how to replace hyperlinks in a Word document.
''' </summary>
Public Class ExReplaceHyperlinks
Inherits ExBase
''' <summary>
''' Finds all hyperlinks in a Word document and changes their URL and display name.
''' </summary>
Public Sub ReplaceHyperlinks()
' Specify your document name here.
Dim doc As New Document(MyDir & "ReplaceHyperlinks.doc")

' Hyperlinks in Word documents are fields, so select all field start nodes to find the hyperlinks.
Dim fieldStarts As NodeList = doc.SelectNodes("//FieldStart")
For Each fieldStart As FieldStart In fieldStarts
If fieldStart.FieldType.Equals(FieldType.FieldHyperlink) Then
' The field is a hyperlink field, use the "facade" class to help to deal with the field.
Dim hyperlink As New Hyperlink(fieldStart)

' Some hyperlinks can be local (links to bookmarks inside the document), ignore these.
If hyperlink.IsLocal Then
Continue For
End If

' The Hyperlink class makes it easy to set the target URL and the display name
' of the link through its properties.
hyperlink.Target = NewUrl
hyperlink.Name = NewName
End If
Next fieldStart

doc.Save(MyDir & "ReplaceHyperlinks Out.doc")
End Sub

Private Const NewUrl As String = "http://www.aspose.com"
Private Const NewName As String = "Aspose - The .NET & Java Component Publisher"
End Class

''' <summary>
''' This "facade" class makes it easier to work with a hyperlink field in a Word document.
'''
''' A hyperlink is represented by a HYPERLINK field in a Word document. A field in Aspose.Words
''' consists of several nodes and it might be difficult to work with all those nodes directly.
''' Note this is a simple implementation and will work only if the hyperlink code and name
''' each consist of one Run only.
'''
''' [FieldStart][Run - field code][FieldSeparator][Run - field result][FieldEnd]
'''
''' The field code contains a string in one of these formats:
''' HYPERLINK "url"
''' HYPERLINK \l "bookmark name"
'''
''' The field result contains text that is displayed to the user.
''' </summary>
Friend Class Hyperlink
Friend Sub New(ByVal fieldStart As FieldStart)
If fieldStart Is Nothing Then
Throw New ArgumentNullException("fieldStart")
End If
If (Not fieldStart.FieldType.Equals(FieldType.FieldHyperlink)) Then
Throw New ArgumentException("Field start type must be FieldHyperlink.")
End If

mFieldStart = fieldStart

' Find the field separator node.
mFieldSeparator = fieldStart.GetField().Separator
If mFieldSeparator Is Nothing Then
Throw New InvalidOperationException("Cannot find field separator.")
End If

mFieldEnd = fieldStart.GetField().End

' The field code looks something like [ HYPERLINK "http://www.myurl.com" ], but it can consist of several runs.
Dim fieldCode As String = fieldStart.GetField().GetFieldCode()
Dim match As Match = gRegex.Match(fieldCode.Trim())
mIsLocal = (match.Groups(1).Length > 0) 'The link is local if \l is present in the field code.
mTarget = match.Groups(2).Value
End Sub

''' <summary>
''' Gets or sets the display name of the hyperlink.
''' </summary>
Friend Property Name() As String
Get
Return GetTextSameParent(mFieldSeparator, mFieldEnd)
End Get
Set(ByVal value As String)
' Hyperlink display name is stored in the field result which is a Run
' node between field separator and field end.
Dim fieldResult As Run = CType(mFieldSeparator.NextSibling, Run)
fieldResult.Text = value

' But sometimes the field result can consist of more than one run, delete these runs.
RemoveSameParent(fieldResult.NextSibling, mFieldEnd)
End Set
End Property

''' <summary>
''' Gets or sets the target url or bookmark name of the hyperlink.
''' </summary>
Friend Property Target() As String
Get
Return mTarget
End Get
Set(ByVal value As String)
mTarget = value
UpdateFieldCode()
End Set
End Property

''' <summary>
''' True if the hyperlink's target is a bookmark inside the document. False if the hyperlink is a url.
''' </summary>
Friend Property IsLocal() As Boolean
Get
Return mIsLocal
End Get
Set(ByVal value As Boolean)
mIsLocal = value
UpdateFieldCode()
End Set
End Property

Private Sub UpdateFieldCode()
' Field code is stored in a Run node between field start and field separator.
Dim fieldCode As Run = CType(mFieldStart.NextSibling, Run)
fieldCode.Text = String.Format("HYPERLINK {0}""{1}""", (If((mIsLocal), "\l ", "")), mTarget)

' But sometimes the field code can consist of more than one run, delete these runs.
RemoveSameParent(fieldCode.NextSibling, mFieldSeparator)
End Sub

''' <summary>
''' Retrieves text from start up to but not including the end node.
''' </summary>
Private Shared Function GetTextSameParent(ByVal startNode As Node, ByVal endNode As Node) As String
If (endNode IsNot Nothing) AndAlso (startNode.ParentNode IsNot endNode.ParentNode) Then
Throw New ArgumentException("Start and end nodes are expected to have the same parent.")
End If

Dim builder As New StringBuilder()
Dim child As Node = startNode
Do While Not child.Equals(endNode)
builder.Append(child.GetText())
child = child.NextSibling
Loop

Return builder.ToString()
End Function

''' <summary>
''' Removes nodes from start up to but not including the end node.
''' Start and end are assumed to have the same parent.
''' </summary>
Private Shared Sub RemoveSameParent(ByVal startNode As Node, ByVal endNode As Node)
If (endNode IsNot Nothing) AndAlso (startNode.ParentNode IsNot endNode.ParentNode) Then
Throw New ArgumentException("Start and end nodes are expected to have the same parent.")
End If

Dim curChild As Node = startNode
Do While (curChild IsNot Nothing) AndAlso (curChild IsNot endNode)
Dim nextChild As Node = curChild.NextSibling
curChild.Remove()
curChild = nextChild
Loop
End Sub

Private ReadOnly mFieldStart As Node
Private ReadOnly mFieldSeparator As Node
Private ReadOnly mFieldEnd As Node
Private mIsLocal As Boolean
Private mTarget As String

''' <summary>
''' Matches a hyperlink field code and captures the optional \l switch (group 1)
''' and the quoted target (group 2).
''' </summary>
Private Shared ReadOnly gRegex As New Regex("\S+" & "\s+" & "(?:""""\s+)?" & "(\\l\s+)?" & """" & "([^""]+)" & """" )
End Class
End Namespace
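
The field-code pattern used by the Hyperlink class can also be tried on its own, without loading a document. The following is a minimal sketch, assuming the same regular expression as gRegex and field codes in the two formats described above. The FieldCodeDemo class and its TryParse method are hypothetical names introduced here for illustration; they are not part of Aspose.Words.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of Aspose.Words.
public static class FieldCodeDemo
{
    // Assumed to be the same pattern as gRegex in the Hyperlink class.
    private static readonly Regex FieldCodeRegex = new Regex(
        "\\S+" +            // field keyword, for example HYPERLINK
        "\\s+" +            // one or more spaces
        "(?:\"\"\\s+)?" +   // optional empty "" argument
        "(\\\\l\\s+)?" +    // group 1: optional \l switch
        "\"" +              // opening quotation mark
        "([^\"]+)" +        // group 2: the target
        "\"");              // closing quotation mark

    // Returns false when the text does not look like a hyperlink field code.
    public static bool TryParse(string fieldCode, out bool isLocal, out string target)
    {
        Match match = FieldCodeRegex.Match(fieldCode.Trim());
        isLocal = match.Success && match.Groups[1].Length > 0;
        target = match.Success ? match.Groups[2].Value : null;
        return match.Success;
    }

    public static void Main()
    {
        bool isLocal;
        string target;

        // A link to a URL: the \l switch is absent.
        TryParse(" HYPERLINK \"http://www.aspose.com\" ", out isLocal, out target);
        Console.WriteLine("{0} {1}", isLocal, target);

        // A link to a bookmark inside the document: the \l switch is present.
        TryParse("HYPERLINK \\l \"Bookmark1\"", out isLocal, out target);
        Console.WriteLine("{0} {1}", isLocal, target);
    }
}
```

Checking match.Success before reading the groups avoids silently producing an empty target when a field code has an unexpected shape, a case the Hyperlink constructor above does not guard against.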
