
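Contents

    • 1. Basic operations
      • 1.1. Creating a SparkSession
      • 1.2. Creating DataFrames
      • 1.3. Dataset operations
      • 1.4. Running SQL queries
      • 1.5. Creating a global temporary view
      • 1.6. Creating Datasets
      • 1.7. Interoperating with RDDs
        • 1.7.1. Inferring the schema using reflection
        • 1.7.2. Programmatically specifying the schema
    • 2. Complete test example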

1. Basic operations

1.1. Creating a SparkSession

import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession
    .builder()
    .appName("Java Spark SQL basic example")
    .config("spark.some.config.option", "some-value")
    .getOrCreate();

1.2. Creating DataFrames

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

Dataset<Row> df = spark.read().json("examples/src/main/resources/people.json");

// Displays the content of the DataFrame to stdout
df.show();
// +----+-------+
// | age|   name|
// +----+-------+
// |null|Michael|
// |  30|   Andy|
// |  19| Justin|
// +----+-------+

1.3. Dataset operations

// col("...") is preferable to df.col("...")
import static org.apache.spark.sql.functions.col;

// Print the schema in a tree format
df.printSchema();
// root
// |-- age: long (nullable = true)
// |-- name: string (nullable = true)

// Select only the "name" column
df.select("name").show();
// +-------+
// |   name|
// +-------+
// |Michael|
// |   Andy|
// | Justin|
// +-------+

// Select everybody, but increment the age by 1
df.select(col("name"), col("age").plus(1)).show();
// +-------+---------+
// |   name|(age + 1)|
// +-------+---------+
// |Michael|     null|
// |   Andy|       31|
// | Justin|       20|
// +-------+---------+

// Select people older than 21
df.filter(col("age").gt(21)).show();
// +---+----+
// |age|name|
// +---+----+
// | 30|Andy|
// +---+----+

// Count people by age
df.groupBy("age").count().show();
// +----+-----+
// | age|count|
// +----+-----+
// |  19|    1|
// |null|    1|
// |  30|    1|
// +----+-----+

1.4. Running SQL queries

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Register the DataFrame as a SQL temporary view
df.createOrReplaceTempView("people");

Dataset<Row> sqlDF = spark.sql("SELECT * FROM people");
sqlDF.show();
// +----+-------+
// | age|   name|
// +----+-------+
// |null|Michael|
// |  30|   Andy|
// |  19| Justin|
// +----+-------+

1.5. Creating a global temporary view

// Register the DataFrame as a global temporary view
df.createGlobalTempView("people");

// Global temporary view is tied to a system preserved database `global_temp`
spark.sql("SELECT * FROM global_temp.people").show();
// +----+-------+
// | age|   name|
// +----+-------+
// |null|Michael|
// |  30|   Andy|
// |  19| Justin|
// +----+-------+

// Global temporary view is cross-session
spark.newSession().sql("SELECT * FROM global_temp.people").show();
// +----+-------+
// | age|   name|
// +----+-------+
// |null|Michael|
// |  30|   Andy|
// |  19| Justin|
// +----+-------+

1.6. Creating Datasets

import java.util.Arrays;
import java.util.Collections;
import java.io.Serializable;

import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;

public static class Person implements Serializable {
    private String name;
    private long age;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public long getAge() {
        return age;
    }

    public void setAge(long age) {
        this.age = age;
    }
}

// Create an instance of a Bean class
Person person = new Person();
person.setName("Andy");
person.setAge(32);

// Encoders are created for Java beans
Encoder<Person> personEncoder = Encoders.bean(Person.class);
Dataset<Person> javaBeanDS = spark.createDataset(
    Collections.singletonList(person),
    personEncoder
);
javaBeanDS.show();
// +---+----+
// |age|name|
// +---+----+
// | 32|Andy|
// +---+----+

// Encoders for most common types are provided in class Encoders
Encoder<Long> longEncoder = Encoders.LONG();
Dataset<Long> primitiveDS = spark.createDataset(Arrays.asList(1L, 2L, 3L), longEncoder);
Dataset<Long> transformedDS = primitiveDS.map(
    (MapFunction<Long, Long>) value -> value + 1L,
    longEncoder);
transformedDS.collect(); // Returns [2, 3, 4]

// DataFrames can be converted to a Dataset by providing a class. Mapping based on name
String path = "examples/src/main/resources/people.json";
Dataset<Person> peopleDS = spark.read().json(path).as(personEncoder);
peopleDS.show();
// +----+-------+
// | age|   name|
// +----+-------+
// |null|Michael|
// |  30|   Andy|
// |  19| Justin|
// +----+-------+

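1.7. Interoperating with RDDs

1.7.1. Inferring the schema using reflection

Spark SQL supports automatically converting an RDD of JavaBeans into a DataFrame. The BeanInfo, obtained using reflection, defines the schema of the table. Currently, Spark SQL does not support JavaBeans that contain Map fields; nested JavaBeans and List or Array fields are supported. You can create a JavaBean by writing a class that implements Serializable and has getters and setters for all of its fields.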

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;

// Create an RDD of Person objects from a text file
JavaRDD<Person> peopleRDD = spark.read()
    .textFile("examples/src/main/resources/people.txt")
    .javaRDD()
    .map(line -> {
        String[] parts = line.split(",");
        Person person = new Person();
        person.setName(parts[0]);
        person.setAge(Integer.parseInt(parts[1].trim()));
        return person;
    });

// Apply a schema to an RDD of JavaBeans to get a DataFrame
Dataset<Row> peopleDF = spark.createDataFrame(peopleRDD, Person.class);
// Register the DataFrame as a temporary view
peopleDF.createOrReplaceTempView("people");

// SQL statements can be run by using the sql methods provided by spark
Dataset<Row> teenagersDF = spark.sql("SELECT name FROM people WHERE age BETWEEN 13 AND 19");

// The columns of a row in the result can be accessed by field index
Encoder<String> stringEncoder = Encoders.STRING();
Dataset<String> teenagerNamesByIndexDF = teenagersDF.map(
    (MapFunction<Row, String>) row -> "Name: " + row.getString(0),
    stringEncoder);
teenagerNamesByIndexDF.show();
// +------------+
// |       value|
// +------------+
// |Name: Justin|
// +------------+

// or by field name
Dataset<String> teenagerNamesByFieldDF = teenagersDF.map(
    (MapFunction<Row, String>) row -> "Name: " + row.<String>getAs("name"),
    stringEncoder);
teenagerNamesByFieldDF.show();
// +------------+
// |       value|
// +------------+
// |Name: Justin|
// +------------+
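
The reflection step described at the start of this section can be observed without Spark. The following is a minimal sketch that uses only the JDK's `java.beans.Introspector` to list the properties of a bean shaped like the article's `Person`. The class name `BeanSchemaSketch` and the helper `describe` are hypothetical names introduced for illustration, and the type names printed are Java types, not Spark SQL types; how Spark maps them to column types is not shown here.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BeanSchemaSketch {

    // Same shape as the Person bean used in this article.
    public static class Person implements Serializable {
        private String name;
        private long age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public long getAge() { return age; }
        public void setAge(long age) { this.age = age; }
    }

    // Hypothetical helper: returns "property: JavaType" for every property
    // that has both a getter and a setter, sorted by property name.
    public static List<String> describe(Class<?> beanClass) {
        BeanInfo info;
        try {
            // Stop at Object so the built-in "class" property is excluded.
            info = Introspector.getBeanInfo(beanClass, Object.class);
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Cannot introspect " + beanClass, e);
        }
        List<String> columns = new ArrayList<>();
        for (PropertyDescriptor p : info.getPropertyDescriptors()) {
            if (p.getReadMethod() != null && p.getWriteMethod() != null) {
                columns.add(p.getName() + ": " + p.getPropertyType().getSimpleName());
            }
        }
        Collections.sort(columns);
        return columns;
    }

    public static void main(String[] args) {
        // Prints:
        // age: long
        // name: String
        describe(Person.class).forEach(System.out::println);
    }
}
```

A field without a getter and setter pair would not appear in this list, which is one way to see why the bean needs accessors for all of its fields.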
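
1.7.2. Programmatically specifying the schema

When JavaBean classes cannot be defined ahead of time (for example, the structure of records is encoded in a string, or a text dataset will be parsed and fields will be projected differently for different users), a Dataset<Row> can be created programmatically in three steps.

  • Create an RDD of Rows from the original RDD.
  • Create the schema, represented by a StructType, that matches the structure of the Rows in the RDD created in step 1.
  • Apply the schema to the RDD of Rows via the createDataFrame method provided by SparkSession.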
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.MapFunction;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;

import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Create an RDD
JavaRDD<String> peopleRDD = spark.sparkContext()
    .textFile("examples/src/main/resources/people.txt", 1)
    .toJavaRDD();

// The schema is encoded in a string
String schemaString = "name age";

// Generate the schema based on the string of schema
List<StructField> fields = new ArrayList<>();
for (String fieldName : schemaString.split(" ")) {
    StructField field = DataTypes.createStructField(fieldName, DataTypes.StringType, true);
    fields.add(field);
}
StructType schema = DataTypes.createStructType(fields);

// Convert records of the RDD (people) to Rows
JavaRDD<Row> rowRDD = peopleRDD.map((Function<String, Row>) record -> {
    String[] attributes = record.split(",");
    return RowFactory.create(attributes[0], attributes[1].trim());
});

// Apply the schema to the RDD
Dataset<Row> peopleDataFrame = spark.createDataFrame(rowRDD, schema);

// Creates a temporary view using the DataFrame
peopleDataFrame.createOrReplaceTempView("people");

// SQL can be run over a temporary view created using DataFrames
Dataset<Row> results = spark.sql("SELECT name FROM people");

// The results of SQL queries are DataFrames and support all the normal RDD operations
// The columns of a row in the result can be accessed by field index or by field name
Dataset<String> namesDS = results.map(
    (MapFunction<Row, String>) row -> "Name: " + row.getString(0),
    Encoders.STRING());
namesDS.show();
// +-------------+
// |        value|
// +-------------+
// |Name: Michael|
// |   Name: Andy|
// | Name: Justin|
// +-------------+
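
The parsing logic in the example above (a space-separated schema string, comma-separated records, values trimmed) can be tried in isolation. The sketch below is a dependency-free illustration of that logic only, not of the Spark API: `SchemaStringSketch`, `fieldNames`, and `toRow` are hypothetical names, a plain `Map` stands in for a Spark `Row`, and the input line format (`name, age`) is an assumption based on how the example splits each record. Unlike the example above, it rejects a record whose value count does not match the schema instead of failing later with an index error.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaStringSketch {

    // Splits a space-separated schema string such as "name age" into field names.
    public static List<String> fieldNames(String schemaString) {
        List<String> names = new ArrayList<>();
        for (String name : schemaString.trim().split("\\s+")) {
            names.add(name);
        }
        return names;
    }

    // Pairs each comma-separated value of a record with its field name,
    // preserving schema order and trimming surrounding whitespace.
    public static Map<String, String> toRow(List<String> fields, String record) {
        String[] values = record.split(",");
        if (values.length != fields.size()) {
            throw new IllegalArgumentException(
                "Expected " + fields.size() + " values but got " + values.length + ": " + record);
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            row.put(fields.get(i), values[i].trim());
        }
        return row;
    }

    public static void main(String[] args) {
        List<String> fields = fieldNames("name age");
        // Prints: {name=Michael, age=29}
        System.out.println(toRow(fields, "Michael, 29"));
    }
}
```

Every value stays a string here, which mirrors the example's use of DataTypes.StringType for all fields.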

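2. Complete test example

The example code was tested on Windows. Running it there requires winutils: download it from https://github.com/steveloughran/winutils, unpack it, and place it in the corresponding Hadoop directory (the one that `hadoop.home.dir` points to in `main`).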

package com.penngo.spark;

import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.*;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import static org.apache.spark.sql.functions.col;

public class SparkDataset {
    private static final String jsonPath = "D:\\hadoop\\spark\\resources\\people.json";
    private static final String txtPath = "D:\\hadoop\\spark\\resources\\people.txt";

    public static class Person implements Serializable {
        private String name;
        private long age;

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        public long getAge() {
            return age;
        }

        public void setAge(long age) {
            this.age = age;
        }
    }

    public static void createDataFrame(SparkSession spark) throws Exception {
        // Create the DataFrame
        Dataset<Row> df = spark.read().json(jsonPath);
        df.show();

        // Dataset operations
        operations(df);
        // SQL queries
        sqlQuery(spark, df);
    }

    public static void operations(Dataset<Row> df) {
        df.printSchema();
        // root
        // |-- age: long (nullable = true)
        // |-- name: string (nullable = true)

        // Select only the "name" column
        df.select("name").show();
        // +-------+
        // |   name|
        // +-------+
        // |Michael|
        // |   Andy|
        // | Justin|
        // +-------+

        // Select everybody, but increment the age by 1
        df.select(col("name"), col("age").plus(1)).show();
        // +-------+---------+
        // |   name|(age + 1)|
        // +-------+---------+
        // |Michael|     null|
        // |   Andy|       31|
        // | Justin|       20|
        // +-------+---------+

        // Select people older than 21
        df.filter(col("age").gt(21)).show();
        // +---+----+
        // |age|name|
        // +---+----+
        // | 30|Andy|
        // +---+----+

        // Count people by age
        df.groupBy("age").count().show();
        // +----+-----+
        // | age|count|
        // +----+-----+
        // |  19|    1|
        // |null|    1|
        // |  30|    1|
        // +----+-----+
    }

    /**
     * SQL queries
     */
    public static void sqlQuery(SparkSession spark, Dataset<Row> df) throws Exception {
        // Temporary view: it disappears when the session ends
        df.createOrReplaceTempView("people");
        Dataset<Row> sqlDF = spark.sql("SELECT * FROM people");
        sqlDF.show();

        // Global temporary view: tied to the system preserved database `global_temp`
        df.createGlobalTempView("people");
        spark.sql("SELECT * FROM global_temp.people").show();
        // +----+-------+
        // | age|   name|
        // +----+-------+
        // |null|Michael|
        // |  30|   Andy|
        // |  19| Justin|
        // +----+-------+

        // A global temporary view is cross-session
        spark.newSession().sql("SELECT * FROM global_temp.people").show();
        // +----+-------+
        // | age|   name|
        // +----+-------+
        // |null|Michael|
        // |  30|   Andy|
        // |  19| Justin|
        // +----+-------+
    }

    public static void createDataset(SparkSession spark) {
        // Convert a list to a Dataset
        Person person = new Person();
        person.setName("Andy");
        person.setAge(32);
        Encoder<Person> personEncoder = Encoders.bean(Person.class);
        Dataset<Person> javaBeanDS = spark.createDataset(
            Collections.singletonList(person),
            personEncoder);
        System.out.println("createDataset show");
        javaBeanDS.show();
        // +---+----+
        // |age|name|
        // +---+----+
        // | 32|Andy|
        // +---+----+

        Encoder<Long> longEncoder = Encoders.LONG();
        Dataset<Long> primitiveDS = spark.createDataset(Arrays.asList(1L, 2L, 3L), longEncoder);
        Dataset<Long> transformedDS = primitiveDS.map(
            (MapFunction<Long, Long>) value -> value + 1L,
            longEncoder);
        transformedDS.collect(); // Returns [2, 3, 4]

        // Read a file and convert it to a Dataset
        Dataset<Person> peopleDS = spark.read().json(jsonPath).as(personEncoder);
        peopleDS.show();
        // +----+-------+
        // | age|   name|
        // +----+-------+
        // |null|Michael|
        // |  30|   Andy|
        // |  19| Justin|
        // +----+-------+
    }

    /**
     * Bean-based conversion (schema inferred by reflection): RDD -> DataFrame -> Dataset
     * @param spark
     * @throws Exception
     */
    public static void rddToDataset(SparkSession spark) throws Exception {
        // Read the file and build an RDD of Person objects
        JavaRDD<Person> peopleRDD = spark.read()
            .textFile(txtPath)
            .javaRDD()
            .map(line -> {
                String[] parts = line.split(",");
                Person person = new Person();
                person.setName(parts[0]);
                person.setAge(Integer.parseInt(parts[1].trim()));
                return person;
            });

        // Convert the RDD to a DataFrame
        Dataset<Row> peopleDF = spark.createDataFrame(peopleRDD, Person.class);
        // Register the DataFrame as a temporary view
        peopleDF.createOrReplaceTempView("people");

        // SQL statements can be run with the sql method provided by spark
        Dataset<Row> teenagersDF = spark.sql("SELECT name FROM people WHERE age BETWEEN 13 AND 19");

        // The columns of a row in the result can be accessed by field index
        Encoder<String> stringEncoder = Encoders.STRING();
        Dataset<String> teenagerNamesByIndexDF = teenagersDF.map(
            (MapFunction<Row, String>) row -> "Name: " + row.getString(0),
            stringEncoder);
        teenagerNamesByIndexDF.show();
        // +------------+
        // |       value|
        // +------------+
        // |Name: Justin|
        // +------------+

        // or by field name
        Dataset<String> teenagerNamesByFieldDF = teenagersDF.map(
            (MapFunction<Row, String>) row -> "Name: " + row.<String>getAs("name"),
            stringEncoder);
        teenagerNamesByFieldDF.show();
        // +------------+
        // |       value|
        // +------------+
        // |Name: Justin|
        // +------------+
    }

    /**
     * Non-Bean conversion (schema specified programmatically): RDD -> DataFrame -> Dataset
     * @param spark
     * @throws Exception
     */
    public static void rddToDataset2(SparkSession spark) throws Exception {
        // Create the RDD
        JavaRDD<String> peopleRDD = spark.sparkContext().textFile(txtPath, 1).toJavaRDD();

        // Field definitions
        String schemaString = "name age";

        // Generate the schema from the schema string
        List<StructField> fields = new ArrayList<>();
        for (String fieldName : schemaString.split(" ")) {
            StructField field = DataTypes.createStructField(fieldName, DataTypes.StringType, true);
            fields.add(field);
        }
        StructType schema = DataTypes.createStructType(fields);

        // Convert the records of the RDD (people) to Rows
        JavaRDD<Row> rowRDD = peopleRDD.map((Function<String, Row>) record -> {
            String[] attributes = record.split(",");
            return RowFactory.create(attributes[0], attributes[1].trim());
        });

        // Apply the schema to the RDD to get a DataFrame
        Dataset<Row> peopleDataFrame = spark.createDataFrame(rowRDD, schema);

        // Create a temporary view from the DataFrame
        peopleDataFrame.createOrReplaceTempView("people");

        // SQL can be run over a temporary view created from a DataFrame
        Dataset<Row> results = spark.sql("SELECT name FROM people");

        // The results of SQL queries are DataFrames and support all the normal RDD operations
        // The columns of a row in the result can be accessed by field index or by field name
        Dataset<String> namesDS = results.map(
            (MapFunction<Row, String>) row -> "Name: " + row.getString(0),
            Encoders.STRING());
        namesDS.show();
        // +-------------+
        // |        value|
        // +-------------+
        // |Name: Michael|
        // |   Name: Andy|
        // | Name: Justin|
        // +-------------+
    }

    public static void main(String[] args) throws Exception {
        Logger.getLogger("org.apache.spark").setLevel(Level.WARN);
        Logger.getLogger("org.apache.eclipse.jetty.server").setLevel(Level.OFF);
        // Debugging Spark on Windows requires https://github.com/steveloughran/winutils
        System.setProperty("hadoop.home.dir", "D:\\hadoop\\hadoop-3.3.1");
        System.setProperty("HADOOP_USER_NAME", "root");
        SparkSession spark = SparkSession
            .builder()
            .appName("SparkDataset")
            .master("local[*]")
            .getOrCreate();

        createDataFrame(spark);
        createDataset(spark);
        rddToDataset(spark);
        rddToDataset2(spark);
        spark.stop();
    }
}
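
Because the example depends on winutils being present under the directory given as `hadoop.home.dir`, it can help to check for the file before starting the session. The following is a small sketch using only the JDK; `WinutilsCheck` and `hasWinutils` are hypothetical names, and the `bin\winutils.exe` location is where Hadoop's Windows support looks for the binary under the Hadoop home directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WinutilsCheck {

    // Returns true when <hadoopHome>/bin/winutils.exe exists as a regular file.
    public static boolean hasWinutils(String hadoopHome) {
        if (hadoopHome == null || hadoopHome.isEmpty()) {
            return false;
        }
        Path winutils = Paths.get(hadoopHome, "bin", "winutils.exe");
        return Files.isRegularFile(winutils);
    }

    public static void main(String[] args) {
        // Pass the Hadoop home as the first argument, or rely on the
        // hadoop.home.dir system property (for example -Dhadoop.home.dir=...).
        String hadoopHome = args.length > 0 ? args[0] : System.getProperty("hadoop.home.dir");
        System.out.println("winutils.exe found: " + hasWinutils(hadoopHome));
    }
}
```

Calling `hasWinutils("D:\\hadoop\\hadoop-3.3.1")` at the top of `main` and stopping with a clear message when it returns false is one option for failing early.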

Based on the official documentation: https://spark.apache.org/docs/3.1.2/sql-getting-started.html
Data sources supported by Spark: https://spark.apache.org/docs/3.1.2/sql-data-sources.html
Spark SQL syntax reference: https://spark.apache.org/docs/3.1.2/sql-ref.html

