Spring: Obtaining the Document



1. Introduction

In earlier articles we analysed how BeanDefinition instances are loaded. Inside org.springframework.beans.factory.xml.XmlBeanDefinitionReader#doLoadBeanDefinitions(...) there is a call to doLoadDocument(inputSource, resource), which turns the InputSource into an XML Document. The method is implemented as follows:

protected Document doLoadDocument(InputSource inputSource, Resource resource) throws Exception {
        // getValidationModeForResource determines the validation mode (DTD or XSD)
        // Parse the InputSource and build a Document object from it
        return this.documentLoader.loadDocument(inputSource, getEntityResolver(), this.errorHandler,
                getValidationModeForResource(resource), isNamespaceAware());
    }

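The method has two main parts. The first is #getValidationModeForResource(...), which determines the validation mode; it was covered in the previous article and is not repeated here. The second is #loadDocument(InputSource inputSource, EntityResolver entityResolver, ErrorHandler errorHandler, int validationMode, boolean namespaceAware). This article covers the implementation of that method and the Document-related types it touches, starting from the core classes.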

2. DocumentLoader

loadDocument is a method of DocumentLoader. Let's first look at this type, which is a top-level interface:

/**
 * Strategy interface for loading an XML {@link Document}.
 *
 * @author Rob Harrop
 * @since 2.0
 * @see DefaultDocumentLoader
 */
public interface DocumentLoader {
    /**
     * Load a {@link Document document} from the supplied {@link InputSource source}.
     * @param inputSource the source of the document that is to be loaded
     * @param entityResolver the resolver that is to be used to resolve any entities
     * @param errorHandler used to report any errors during document loading
     * @param validationMode the type of validation
     * {@link org.springframework.util.xml.XmlValidationModeDetector#VALIDATION_DTD DTD}
     * or {@link org.springframework.util.xml.XmlValidationModeDetector#VALIDATION_XSD XSD})
     * @param namespaceAware {@code true} if support for XML namespaces is to be provided
     * @return the loaded {@link Document document}
     * @throws Exception if an error occurs
     */
    Document loadDocument(
            InputSource inputSource, EntityResolver entityResolver,
            ErrorHandler errorHandler, int validationMode, boolean namespaceAware)
            throws Exception;
}

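As we can see, this is a top-level interface that defines a single loadDocument method. Spring ships only one default implementation, DefaultDocumentLoader, and we can also supply our own custom implementation.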

3. DefaultDocumentLoader

This is the default implementation of DocumentLoader. Here is the code of #loadDocument(...), the only method it implements:

/**
     * Load the {@link Document} at the supplied {@link InputSource} using the standard JAXP-configured
     * XML parser.
     */
    @Override
    public Document loadDocument(InputSource inputSource, EntityResolver entityResolver,
            ErrorHandler errorHandler, int validationMode, boolean namespaceAware) throws Exception {
        // 1. Build the factory from the validation mode and the namespace-awareness flag
        DocumentBuilderFactory factory = createDocumentBuilderFactory(validationMode, namespaceAware);
        if (logger.isTraceEnabled()) {
            logger.trace("Using JAXP provider [" + factory.getClass().getName() + "]");
        }
        // 2. Use the factory to create a DocumentBuilder (with the JDK's built-in parser the factory is typically com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl)
        DocumentBuilder builder = createDocumentBuilder(factory, entityResolver, errorHandler);
        // 3. Parse the InputSource into a Document
        return builder.parse(inputSource);
    }

The code above relies on two main methods: #createDocumentBuilderFactory(...) obtains the builder factory, and #createDocumentBuilder(...) obtains the builder.

  • #createDocumentBuilderFactory(...) is implemented as follows:
/**
     * Create the {@link DocumentBuilderFactory} instance.
     * @param validationMode the type of validation: {@link XmlValidationModeDetector#VALIDATION_DTD DTD}
     * or {@link XmlValidationModeDetector#VALIDATION_XSD XSD})
     * @param namespaceAware whether the returned factory is to provide support for XML namespaces
     * @return the JAXP DocumentBuilderFactory
     * @throws ParserConfigurationException if we failed to build a proper DocumentBuilderFactory
     */
    protected DocumentBuilderFactory createDocumentBuilderFactory(int validationMode, boolean namespaceAware)
            throws ParserConfigurationException {
        // With the JDK's built-in parser this is typically com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        // 1. Check whether validation is enabled
        if (validationMode != XmlValidationModeDetector.VALIDATION_NONE) {
            factory.setValidating(true);
            if (validationMode == XmlValidationModeDetector.VALIDATION_XSD) {
                // Enforce namespace aware for XSD...
                factory.setNamespaceAware(true);
                try {
                    factory.setAttribute(SCHEMA_LANGUAGE_ATTRIBUTE, XSD_SCHEMA_LANGUAGE);
                }
                catch (IllegalArgumentException ex) {
                    ParserConfigurationException pcex = new ParserConfigurationException(
                            "Unable to validate using XSD: Your JAXP provider [" + factory +
                            "] does not support XML Schema. Are you running on Java 1.4 with Apache Crimson? " +
                            "Upgrade to Apache Xerces (or Java 1.5) for full XSD support.");
                    pcex.initCause(ex);
                    throw pcex;
                }
            }
        }
        return factory;
    }
  • #createDocumentBuilder(...) is implemented as follows:
protected DocumentBuilder createDocumentBuilder(DocumentBuilderFactory factory,
            @Nullable EntityResolver entityResolver, @Nullable ErrorHandler errorHandler)
            throws ParserConfigurationException {
        // 1. Create the DocumentBuilder
        DocumentBuilder docBuilder = factory.newDocumentBuilder();
        // 2. Set the EntityResolver
        if (entityResolver != null) {
            docBuilder.setEntityResolver(entityResolver);
        }
        // 3. Set the ErrorHandler
        if (errorHandler != null) {
            docBuilder.setErrorHandler(errorHandler);
        }
        return docBuilder;
    }

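The overall flow is fairly simple. The code above mainly sets the EntityResolver and the ErrorHandler; EntityResolver is covered in detail later in this article.

  • Finally, DocumentBuilder#parse(...) is called to parse the input and obtain the XML Document object.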
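
The factory, builder, and parse steps described in this section can be reproduced with the JDK's JAXP API alone. The following is a minimal sketch under stated assumptions, not Spring's code: the class name LoadDocumentSketch and the helper load are hypothetical, validation is left off, no EntityResolver is registered, and the XML comes from an in-memory string instead of a Resource.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical sketch of the factory -> builder -> parse flow (JDK only, no Spring).
public class LoadDocumentSketch {

    static Document load(String xml, boolean namespaceAware) {
        try {
            // 1. Configure the factory (validation stays off in this sketch)
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);

            // 2. Create the builder and register an ErrorHandler that fails fast
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException ex) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException ex) throws SAXParseException { throw ex; }
                @Override public void fatalError(SAXParseException ex) throws SAXParseException { throw ex; }
            });

            // 3. Parse the InputSource into a DOM Document
            return builder.parse(new InputSource(new StringReader(xml)));
        }
        catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }

    public static void main(String[] args) {
        Document doc = load("<beans><bean id=\"a\"/></beans>", true);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Wrapping the checked exceptions in an IllegalStateException is a choice made for this sketch only; Spring's loadDocument simply declares throws Exception.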

4. EntityResolver

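When XmlBeanDefinitionReader#doLoadDocument(...) is called, one of the arguments comes from #getEntityResolver(), which returns an EntityResolver object. The interface has many implementations; we only look at the ones used here. The class diagram is as follows:

[Figure: EntityResolver class diagram]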


The org.xml.sax.EntityResolver interface:

public interface EntityResolver {
    /**
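     * publicId: the public identifier of the external entity being referenced, or null if none was supplied.
     * systemId: the system identifier of the external entity being referenced.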
     *
     */
    public abstract InputSource resolveEntity (String publicId,
                                               String systemId)
        throws SAXException, IOException;
}

The implementation classes:

  • org.springframework.beans.factory.xml.BeansDtdResolver: implements the EntityResolver interface. It is the resolver for the Spring beans DTD and loads the DTD from the classpath or a jar file. Part of the code:
private static final String DTD_EXTENSION = ".dtd";
private static final String DTD_NAME = "spring-beans";
  • org.springframework.beans.factory.xml.PluggableSchemaResolver: implements the EntityResolver interface. It reads every "META-INF/spring.schemas" file on the classpath into a map from schema URL to local schema file location. The code is as follows:
/**
 * The location of the file that defines schema mappings.
 * Can be present in multiple JAR files.
 *
 * Default value of {@link #schemaMappingsLocation}.
 */
public static final String DEFAULT_SCHEMA_MAPPINGS_LOCATION = "META-INF/spring.schemas";
@Nullable
private final ClassLoader classLoader;
/**
 * Location of the schema mappings file
 */
private final String schemaMappingsLocation;
/** Stores the mapping of schema URL -> local schema path. */
@Nullable
private volatile Map<String, String> schemaMappings; // maps each schema URL to a local schema file path
  • org.springframework.beans.factory.xml.DelegatingEntityResolver: implements the EntityResolver interface and delegates to BeansDtdResolver for DTDs and to PluggableSchemaResolver for XML schemas. The code is as follows:
/** Suffix for DTD files. */
public static final String DTD_SUFFIX = ".dtd";
/** Suffix for schema definition files. */
public static final String XSD_SUFFIX = ".xsd";
private final EntityResolver dtdResolver;
private final EntityResolver schemaResolver;
// Default delegates
public DelegatingEntityResolver(@Nullable ClassLoader classLoader) {
    this.dtdResolver = new BeansDtdResolver();
    this.schemaResolver = new PluggableSchemaResolver(classLoader);
}
// Custom delegates
public DelegatingEntityResolver(EntityResolver dtdResolver, EntityResolver schemaResolver) {
    Assert.notNull(dtdResolver, "'dtdResolver' is required");
    Assert.notNull(schemaResolver, "'schemaResolver' is required");
    this.dtdResolver = dtdResolver;
    this.schemaResolver = schemaResolver;
}

org.springframework.beans.factory.xml.ResourceEntityResolver: extends DelegatingEntityResolver and resolves entity references through a ResourceLoader. The code is as follows:

private final ResourceLoader resourceLoader;
public ResourceEntityResolver(ResourceLoader resourceLoader) {
    super(resourceLoader.getClassLoader());
    this.resourceLoader = resourceLoader;
}

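4.1 Purpose

From "Spring 源码深度解析":
The loadDocument method takes an EntityResolver parameter. What is an EntityResolver? The official documentation explains it this way: if a SAX application needs to implement customized handling for external entities, it must implement this interface and register an instance with the SAX driver using the setEntityResolver method. In other words, when parsing an XML document, SAX first reads the declaration in the document and uses it to locate the corresponding DTD definition, so that the document can be validated. By default the DTD is downloaded over the network, from the URI given in the declaration, and then used for validation. Downloading is slow, and when the network is down or unavailable parsing fails because the DTD cannot be found. An EntityResolver lets the project itself supply the lookup: the program decides how to find the DTD, for example by keeping the DTD file somewhere in the project and returning it to SAX directly. This avoids looking up the declaration over the network.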
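
To make the idea concrete, here is a minimal sketch of an EntityResolver that serves a DTD from memory so the parser never goes to the network. Everything in it is an assumption made for illustration: the class LocalDtdResolverSketch, the tiny DTD text, and the URL http://example.invalid/dtd/sample-beans.dtd are hypothetical, and the code uses only the JDK's JAXP and SAX classes rather than Spring's resolvers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical sketch: answer DTD lookups locally instead of downloading them.
public class LocalDtdResolverSketch {

    // Stand-in for a DTD file shipped inside the project
    static final String LOCAL_DTD =
            "<!ELEMENT beans (bean*)>\n" +
            "<!ELEMENT bean EMPTY>\n" +
            "<!ATTLIST bean id CDATA #IMPLIED>";

    // Records every system id the parser asked us to resolve
    static final List<String> requested = new ArrayList<>();

    static String parseRootName(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            EntityResolver resolver = (publicId, systemId) -> {
                requested.add(systemId);
                if (systemId != null && systemId.endsWith(".dtd")) {
                    InputSource source = new InputSource(new StringReader(LOCAL_DTD));
                    source.setPublicId(publicId);
                    source.setSystemId(systemId);
                    return source;
                }
                return null; // let the parser fall back to its default behavior
            };
            builder.setEntityResolver(resolver);
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        }
        catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE beans SYSTEM \"http://example.invalid/dtd/sample-beans.dtd\">\n"
                + "<beans><bean id=\"a\"/></beans>";
        System.out.println(parseRootName(xml));
        System.out.println(requested);
    }
}
```

Without the resolver, the parser would try to open the URL in the DOCTYPE declaration itself, which is the slow and failure-prone path the quoted passage describes.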

4.2 getEntityResolver

We use XmlBeanDefinitionReader#getEntityResolver() as the entry point for looking at the concrete implementations. First, the code:

/**
     * Return the EntityResolver to use, building a default resolver
     * if none specified.
     */
    protected EntityResolver getEntityResolver() {
        // 1. If no resolver has been set, build a default one; otherwise return the existing one
        if (this.entityResolver == null) {
            // Determine default EntityResolver to use.
            // 2. Obtain the resource loader
            ResourceLoader resourceLoader = getResourceLoader();
            if (resourceLoader != null) {
                // 3. A resource loader exists: create a ResourceEntityResolver from it
                this.entityResolver = new ResourceEntityResolver(resourceLoader);
            }
            else {
                // 4. No resource loader: create a DelegatingEntityResolver from the bean class loader
                this.entityResolver = new DelegatingEntityResolver(getBeanClassLoader());
            }
        }
        return this.entityResolver;
    }

As the code shows, the default is built only when no entityResolver has been set. If the resource loader is not null, new ResourceEntityResolver(resourceLoader) is used to create the resolver; otherwise new DelegatingEntityResolver(getBeanClassLoader()) is used.

4.3 DelegatingEntityResolver

An implementation of EntityResolver that also dispatches to delegates. Here are its constructors:

/** Suffix for DTD files. */
    public static final String DTD_SUFFIX = ".dtd";
    /** Suffix for schema definition files. */
    public static final String XSD_SUFFIX = ".xsd";
    /**
     * Create a new DelegatingEntityResolver that delegates to
     * a default {@link BeansDtdResolver} and a default {@link PluggableSchemaResolver}.
     * <p>Configures the {@link PluggableSchemaResolver} with the supplied
     * {@link ClassLoader}.
     * @param classLoader the ClassLoader to use for loading
     * (can be {@code null}) to use the default ClassLoader)
     */
    public DelegatingEntityResolver(@Nullable ClassLoader classLoader) {
        // DTD resolver
        this.dtdResolver = new BeansDtdResolver();
        // XSD resolver
        this.schemaResolver = new PluggableSchemaResolver(classLoader);
    }
    /**
     * Create a new DelegatingEntityResolver that delegates to
     * the given {@link EntityResolver EntityResolvers}.
     * @param dtdResolver the EntityResolver to resolve DTDs with
     * @param schemaResolver the EntityResolver to resolve XML schemas with
     */
    public DelegatingEntityResolver(EntityResolver dtdResolver, EntityResolver schemaResolver) {
        Assert.notNull(dtdResolver, "'dtdResolver' is required");
        Assert.notNull(schemaResolver, "'schemaResolver' is required");
        this.dtdResolver = dtdResolver;
        this.schemaResolver = schemaResolver;
    }
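  • As the code above shows, DelegatingEntityResolver provides two constructors: one creates the default resolvers and the other accepts the resolvers to use. The actual work is delegated to two implementations, BeansDtdResolver and PluggableSchemaResolver, as its resolveEntity method shows: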
/**
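     * publicId: the public identifier of the external entity being referenced, or null if none was supplied.
     * systemId: the system identifier of the external entity being referenced.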
     *
     */
    @Override
    @Nullable
    public InputSource resolveEntity(@Nullable String publicId, @Nullable String systemId)
            throws SAXException, IOException {
        if (systemId != null) {
            // 1. The system id ends with .dtd
            if (systemId.endsWith(DTD_SUFFIX)) {
                return this.dtdResolver.resolveEntity(publicId, systemId);
            }
            // 2. The system id ends with .xsd
            else if (systemId.endsWith(XSD_SUFFIX)) {
                return this.schemaResolver.resolveEntity(publicId, systemId);
            }
        }
        // Fall back to the parser's default behavior.
        return null;
    }
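
The dispatch rule in resolveEntity can be exercised in isolation. The sketch below is a hypothetical JDK-only re-creation of the suffix check, not Spring's class: SuffixDispatchSketch and the route helper are made-up names, and the two delegates only record which branch was taken instead of loading anything.

```java
import java.io.IOException;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch of suffix-based delegation between two resolvers.
public class SuffixDispatchSketch implements EntityResolver {

    private final EntityResolver dtdResolver;
    private final EntityResolver schemaResolver;

    SuffixDispatchSketch(EntityResolver dtdResolver, EntityResolver schemaResolver) {
        this.dtdResolver = dtdResolver;
        this.schemaResolver = schemaResolver;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
        if (systemId != null) {
            if (systemId.endsWith(".dtd")) {
                return this.dtdResolver.resolveEntity(publicId, systemId);
            }
            if (systemId.endsWith(".xsd")) {
                return this.schemaResolver.resolveEntity(publicId, systemId);
            }
        }
        return null; // neither suffix matched: the parser uses its default behavior
    }

    // Reports which delegate a system id is routed to ("dtd", "xsd" or "parser-default")
    static String route(String systemId) {
        StringBuilder hit = new StringBuilder("parser-default");
        EntityResolver dtd = (p, s) -> { hit.setLength(0); hit.append("dtd"); return null; };
        EntityResolver xsd = (p, s) -> { hit.setLength(0); hit.append("xsd"); return null; };
        try {
            new SuffixDispatchSketch(dtd, xsd).resolveEntity(null, systemId);
        }
        catch (SAXException | IOException ex) {
            throw new IllegalStateException(ex);
        }
        return hit.toString();
    }

    public static void main(String[] args) {
        System.out.println(route("http://www.springframework.org/dtd/spring-beans-2.0.dtd"));
        System.out.println(route("http://www.springframework.org/schema/beans/spring-beans.xsd"));
        System.out.println(route("http://example.org/file.xml"));
    }
}
```

Because the check is a plain endsWith on the system id, a system id with any other ending falls straight through to the parser's default behavior.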

4.4 BeansDtdResolver

As shown above, DTDs are resolved by BeansDtdResolver. This class also implements the EntityResolver interface and its #resolveEntity(...) method. The core code:

@Override
    @Nullable
    public InputSource resolveEntity(@Nullable String publicId, @Nullable String systemId) throws IOException {
        if (logger.isTraceEnabled()) {
            logger.trace("Trying to resolve XML entity with public ID [" + publicId +
                    "] and system ID [" + systemId + "]");
        }
        // 1. Check for the .dtd suffix
        if (systemId != null && systemId.endsWith(DTD_EXTENSION)) {
            int lastPathSeparator = systemId.lastIndexOf('/');
            int dtdNameStart = systemId.indexOf(DTD_NAME, lastPathSeparator);
            // 2. Check that the file name contains spring-beans
            if (dtdNameStart != -1) {
                String dtdFile = DTD_NAME + DTD_EXTENSION;
                if (logger.isTraceEnabled()) {
                    logger.trace("Trying to locate [" + dtdFile + "] in Spring jar on classpath");
                }
                try {
                    // 3. Load spring-beans.dtd from the classpath
                    Resource resource = new ClassPathResource(dtdFile, getClass());
                    InputSource source = new InputSource(resource.getInputStream());
                    source.setPublicId(publicId);
                    source.setSystemId(systemId);
                    if (logger.isTraceEnabled()) {
                        logger.trace("Found beans DTD [" + systemId + "] in classpath: " + dtdFile);
                    }
                    return source;
                }
                catch (FileNotFoundException ex) {
                    if (logger.isDebugEnabled()) {
                        logger.debug("Could not resolve beans DTD [" + systemId + "]: not found in classpath", ex);
                    }
                }
            }
        }
        // Fall back to the parser's default behavior (typically a download over the network).
        return null;
    }

The method mainly checks the systemId, then builds an InputSource from a ClassPathResource for spring-beans.dtd and returns it.
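
The systemId check is easy to misread, so here is the matching logic on its own. This is a sketch that copies only the string handling from the method above; the class BeansDtdNameSketch and the helper localDtdFile are hypothetical names, and the actual classpath loading is left out.

```java
// Hypothetical sketch of the system id check performed before loading spring-beans.dtd.
public class BeansDtdNameSketch {

    private static final String DTD_EXTENSION = ".dtd";
    private static final String DTD_NAME = "spring-beans";

    // Returns the local DTD file name to load, or null if the system id does not match
    static String localDtdFile(String systemId) {
        if (systemId != null && systemId.endsWith(DTD_EXTENSION)) {
            int lastPathSeparator = systemId.lastIndexOf('/');
            // Search for "spring-beans" only in the last path segment
            int dtdNameStart = systemId.indexOf(DTD_NAME, lastPathSeparator);
            if (dtdNameStart != -1) {
                return DTD_NAME + DTD_EXTENSION;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(localDtdFile("http://www.springframework.org/dtd/spring-beans-2.0.dtd"));
        System.out.println(localDtdFile("http://example.org/spring-beans/other.dtd"));
    }
}
```

Note that a versioned name such as spring-beans-2.0.dtd still maps to the single local file spring-beans.dtd, while a system id that contains spring-beans only as a directory name does not match.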

4.5 PluggableSchemaResolver

Resolution in this class is similar to BeansDtdResolver above:

@Nullable
private final ClassLoader classLoader;
/**
 * Location of the schema mappings file
 */
private final String schemaMappingsLocation;
/** Stores the mapping of schema URL -> local schema path. */
@Nullable
private volatile Map<String, String> schemaMappings; // maps each schema URL to a local schema file path
@Override
@Nullable
public InputSource resolveEntity(String publicId, @Nullable String systemId) throws IOException {
    if (logger.isTraceEnabled()) {
        logger.trace("Trying to resolve XML entity with public id [" + publicId +
                "] and system id [" + systemId + "]");
    }
    if (systemId != null) {
        // Look up the local resource location
        String resourceLocation = getSchemaMappings().get(systemId);
        if (resourceLocation != null) {
            // Create a ClassPathResource
            Resource resource = new ClassPathResource(resourceLocation, this.classLoader);
            try {
                // Create the InputSource and set its publicId and systemId
                InputSource source = new InputSource(resource.getInputStream());
                source.setPublicId(publicId);
                source.setSystemId(systemId);
                if (logger.isTraceEnabled()) {
                    logger.trace("Found XML schema [" + systemId + "] in classpath: " + resourceLocation);
                }
                return source;
            }
            catch (FileNotFoundException ex) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Could not find XML schema [" + systemId + "]: " + resource, ex);
                }
            }
        }
    }
    return null;
}
  • First, #getSchemaMappings() is called to obtain the mapping table (from systemId to its local location). The code is as follows:
private Map<String, String> getSchemaMappings() {
    Map<String, String> schemaMappings = this.schemaMappings;
    // Double-checked locking, so that schemaMappings is initialized only once
    if (schemaMappings == null) {
        synchronized (this) {
            schemaMappings = this.schemaMappings;
            if (schemaMappings == null) {
                if (logger.isTraceEnabled()) {
                    logger.trace("Loading schema mappings from [" + this.schemaMappingsLocation + "]");
                }
                try {
                    // Read schemaMappingsLocation as Properties
                    Properties mappings = PropertiesLoaderUtils.loadAllProperties(this.schemaMappingsLocation, this.classLoader);
                    if (logger.isTraceEnabled()) {
                        logger.trace("Loaded schema mappings: " + mappings);
                    }
                    // Copy the loaded mappings into schemaMappings
                    schemaMappings = new ConcurrentHashMap<>(mappings.size());
                    CollectionUtils.mergePropertiesIntoMap(mappings, schemaMappings);
                    this.schemaMappings = schemaMappings;
                } catch (IOException ex) {
                    throw new IllegalStateException(
                            "Unable to load schema mappings from location [" + this.schemaMappingsLocation + "]", ex);
                }
            }
        }
    }
    return schemaMappings;
}
  • Next, the local path resourceLocation is looked up for the given systemId.
  • Finally, an InputSource object is built from resourceLocation.
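
The lookup steps listed above can be sketched without Spring by parsing a spring.schemas-style entry with java.util.Properties. The class SchemaMappingsSketch, the helper localPath, and the in-memory SAMPLE text are assumptions for illustration; the single entry is an illustrative example of the file format, in which the colon of the URL is escaped because it would otherwise end the key.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: lazy, thread-safe loading of schema URL -> local path mappings.
public class SchemaMappingsSketch {

    // Stand-in for the content of a META-INF/spring.schemas file (illustrative entry)
    static final String SAMPLE =
            "http\\://www.springframework.org/schema/beans/spring-beans.xsd"
            + "=org/springframework/beans/factory/xml/spring-beans.xsd\n";

    private static volatile Map<String, String> schemaMappings;

    static Map<String, String> getSchemaMappings() {
        Map<String, String> mappings = schemaMappings;
        if (mappings == null) {                       // first check, without locking
            synchronized (SchemaMappingsSketch.class) {
                mappings = schemaMappings;
                if (mappings == null) {               // second check, under the lock
                    Properties props = new Properties();
                    try {
                        props.load(new StringReader(SAMPLE));
                    }
                    catch (IOException ex) {
                        throw new IllegalStateException("Unable to load schema mappings", ex);
                    }
                    mappings = new ConcurrentHashMap<>();
                    for (String key : props.stringPropertyNames()) {
                        mappings.put(key, props.getProperty(key));
                    }
                    schemaMappings = mappings;
                }
            }
        }
        return mappings;
    }

    // Returns the local classpath location for a system id, or null if it is unknown
    static String localPath(String systemId) {
        return (systemId != null ? getSchemaMappings().get(systemId) : null);
    }

    public static void main(String[] args) {
        System.out.println(localPath("http://www.springframework.org/schema/beans/spring-beans.xsd"));
    }
}
```

The real resolver merges the properties from every matching file on the classpath, which is how additional jars can contribute their own schema mappings; this sketch reads a single in-memory source only.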

4.6 ResourceEntityResolver

The resolution process of ResourceEntityResolver is as follows:

private final ResourceLoader resourceLoader;
@Override
@Nullable
public InputSource resolveEntity(String publicId, @Nullable String systemId) throws SAXException, IOException {
    // Let the parent class try to resolve first
    InputSource source = super.resolveEntity(publicId, systemId);
    // The parent could not resolve it: fall back to the resourceLoader
    if (source == null && systemId != null) {
        // Work out resourcePath, i.e. the location of the Resource
        String resourcePath = null;
        try {
            String decodedSystemId = URLDecoder.decode(systemId, "UTF-8"); // decode systemId as UTF-8
            String givenUrl = new URL(decodedSystemId).toString(); // normalize to a URL string
            // Resolve the path relative to the system root (the current working directory)
            String systemRootUrl = new File("").toURI().toURL().toString();
            // Try relative to resource base if currently in system root.
            if (givenUrl.startsWith(systemRootUrl)) {
                resourcePath = givenUrl.substring(systemRootUrl.length());
            }
        } catch (Exception ex) {
            // Typically a MalformedURLException or AccessControlException.
            if (logger.isDebugEnabled()) {
                logger.debug("Could not resolve XML entity [" + systemId + "] against system root URL", ex);
            }
            // No URL (or no resolvable URL) -> try relative to resource base.
            resourcePath = systemId;
        }
        if (resourcePath != null) {
            if (logger.isTraceEnabled()) {
                logger.trace("Trying to locate XML entity [" + systemId + "] as resource [" + resourcePath + "]");
            }
            // Obtain the Resource
            Resource resource = this.resourceLoader.getResource(resourcePath);
            // Create the InputSource
            source = new InputSource(resource.getInputStream());
            // Set publicId and systemId
            source.setPublicId(publicId);
            source.setSystemId(systemId);
            if (logger.isDebugEnabled()) {
                logger.debug("Found XML entity [" + systemId + "]: " + resource);
            }
        }
    }
    return source;
}
  • First, the parent class's method is called to resolve the entity.
  • If that fails, the resourceLoader is used to try to read the Resource that corresponds to the systemId.
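
The fallback path computation is the least obvious part of this method, so the sketch below isolates it. It is a hypothetical JDK-only extraction of the string handling: SystemRootPathSketch and resourcePathFor are made-up names, the ResourceLoader lookup is omitted, and new URL(String) is kept to mirror the quoted code even though recent JDKs mark it as deprecated.

```java
import java.io.File;
import java.net.URL;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of how the resource path is derived from a system id.
public class SystemRootPathSketch {

    // Returns the path to hand to a resource loader, or null if nothing should be tried
    static String resourcePathFor(String systemId) {
        try {
            String decodedSystemId = URLDecoder.decode(systemId, StandardCharsets.UTF_8);
            String givenUrl = new URL(decodedSystemId).toString();
            // URL of the current working directory, e.g. file:/some/dir/
            String systemRootUrl = new File("").toURI().toURL().toString();
            if (givenUrl.startsWith(systemRootUrl)) {
                // Strip the system root so the path becomes relative to the resource base
                return givenUrl.substring(systemRootUrl.length());
            }
            return null;
        }
        catch (Exception ex) {
            // Not a resolvable URL: try the system id itself as a relative path
            return systemId;
        }
    }

    public static void main(String[] args) {
        System.out.println(resourcePathFor("conf/app.dtd"));
        System.out.println(resourcePathFor("http://example.org/dtd/app.dtd"));
    }
}
```

A system id without a protocol makes new URL(...) throw, so it is used unchanged as the resource path; an absolute URL outside the working directory yields null and no lookup is attempted in the code shown above.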

Note: this article draws on the Spring source-code analyses by 艿艿 and 小明哥; the author is a loyal reader of both.


Author: AnonyStar
Copyright notice: unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit AnonyStar when reposting!