UserScan的处理流程分析
前置说明
Userscan是通过client或cp中发起的scanner操作。
在Scan中通过caching属性来返回可以返回多少条数据,每次进行next时。
通过batch属性来设置每次在rs端每次next kv时,可读取多少个kv,(在同一行的情况下)
在生成Scan实例时,最好是把family与column都设置上,这样能保证查询的最高效.
client端通过生成Scan实例,通过HTable下的如下方法得到ClientScanner实例
public ResultScanner getScanner(final Scan scan)
在生成的ClientScanner实例中的callable属性的值为生成的一个ScannerCallable实例。
并通过callable.prepare(tries != 0);方法得到此scan的startkey所在的region的location.在meta表中。
把startkey对应的location中得到此location的HRegionInfo信息。
并设置ClientScanner.currentRegion的值为当前的region.也就是startkey所在的region.
通过ClientScanner.next向rs发起rpc调用操作。调用HRegionServer.scan
public ScanResponse scan(finalRpcControllercontroller, final ScanRequest request)
ClientScanner.next时,首先是发起openScanner操作,得到一个ScannerId
通过ScannerCallable.call方法:
if (scannerId == -1L) {
this.scannerId = openScanner();
} else {
openScanner方法:中发起一个scan操作,通过rpc调用rs.scan
ScanRequest request =
RequestConverter.buildScanRequest(
getLocation().getRegionInfo().getRegionName(),
this.scan, 0, false);
try {
ScanResponse response = getStub().scan(null, request);
longid = response.getScannerId();
if (logScannerActivity) {
LOG.info("Open scanner=" + id + " for scan=" + scan.toString()
+ " on region " + getLocation().toString());
}
returnid;
HregionServer.scan中对openScanner的处理:
public ScanResponse scan(finalRpcControllercontroller, final ScanRequest request)
throws ServiceException {
Leases.Lease lease = null;
String scannerName = null;
........................................很多代码没有显示
requestCount.increment();
intttl = 0;
HRegion region = null;
RegionScannerscanner = null;
RegionScannerHolder rsh = null;
booleanmoreResults = true;
booleancloseScanner = false;
ScanResponse.Builder builder = ScanResponse.newBuilder();
if (request.hasCloseScanner()) {
closeScanner = request.getCloseScanner();
}
introws = 1;
if (request.hasNumberOfRows()) {
rows = request.getNumberOfRows();
}
if (request.hasScannerId()) {
.................................很多代码没有显示
} else {
得到请求的HRegion实例,也就是startkey所在的HRegion
region = getRegion(request.getRegion());
ClientProtos.Scan protoScan = request.getScan();
booleanisLoadingCfsOnDemandSet = protoScan.hasLoadColumnFamiliesOnDemand();
Scan scan = ProtobufUtil.toScan(protoScan);
// if the request doesn't set this, get the default region setting.
if (!isLoadingCfsOnDemandSet) {
scan.setLoadColumnFamiliesOnDemand(region.isLoadingCfsOnDemandDefault());
}
scan.getAttribute(Scan.SCAN_ATTRIBUTES_METRICS_ENABLE);
如果scan没有设置family,把region中所有的family当成scan的family
region.prepareScanner(scan);
if (region.getCoprocessorHost() != null) {
scanner = region.getCoprocessorHost().preScannerOpen(scan);
}
if (scanner == null) {
执行HRegion.getScanner方法。生成HRegion.RegionScannerImpl方法
scanner = region.getScanner(scan);
}
if (region.getCoprocessorHost() != null) {
scanner = region.getCoprocessorHost().postScannerOpen(scan, scanner);
}
把生成的RegionScanner添加到scanners集合容器中。并设置scannerid(一个随机的值),
scannername是scannerid的string版本。添加过期监控处理,
通过hbase.client.scanner.timeout.period配置过期时间,默认值为60000ms
老版本通过hbase.regionserver.lease.period配置。
过期检查线程通过Leases完成。对scanner的过期处理通过一个
HregionServer.ScannerListener.leaseExpired实例来完成。
scannerId = addScanner(scanner, region);
scannerName = String.valueOf(scannerId);
ttl = this.scannerLeaseTimeoutPeriod;
}
............................................很多代码没有显示
Hregion.getScanner方法生成RegionScanner实例流程
publicRegionScannergetScanner(Scan scan) throws IOException {
returngetScanner(scan, null);
}
层次的调用,此时传入的kvscannerlist为null
protectedRegionScannergetScanner(Scan scan,
List<KeyValueScanner> additionalScanners) throws IOException {
startRegionOperation(Operation.SCAN);
try {
// Verify families are all valid
prepareScanner(scan);
if(scan.hasFamilies()) {
for(byte [] family : scan.getFamilyMap().keySet()) {
checkFamily(family);
}
}
returninstantiateRegionScanner(scan, additionalScanners);
} finally {
closeRegionOperation();
}
}
最终生成一个HRegion.RegionScannerImpl实例
protectedRegionScannerinstantiateRegionScanner(Scan scan,
List<KeyValueScanner> additionalScanners) throws IOException {
returnnewRegionScannerImpl(scan, additionalScanners, this);
}
RegionScanner实例的生成构造方法:
RegionScannerImpl(Scan scan, List<KeyValueScanner> additionalScanners, HRegion region)
throws IOException {
this.region = region;
this.maxResultSize = scan.getMaxResultSize();
if (scan.hasFilter()) {
this.filter = newFilterWrapper(scan.getFilter());
} else {
this.filter = null;
}
this.batch = scan.getBatch();
if (Bytes.equals(scan.getStopRow(), HConstants.EMPTY_END_ROW) && !scan.isGetScan()) {
this.stopRow = null;
} else {
this.stopRow = scan.getStopRow();
}
// If we are doing a get, we want to be [startRow,endRow] normally
// it is [startRow,endRow) and if startRow=endRow we get nothing.
this.isScan = scan.isGetScan() ? -1 : 0;
// synchronize on scannerReadPoints so that nobody calculates
// getSmallestReadPoint, before scannerReadPoints is updated.
IsolationLevelisolationLevel = scan.getIsolationLevel();
synchronized(scannerReadPoints) {
if (isolationLevel == IsolationLevel.READ_UNCOMMITTED) {
// This scan can read even uncommitted transactions
this.readPt = Long.MAX_VALUE;
MultiVersionConsistencyControl.setThreadReadPoint(this.readPt);
} else {
this.readPt = MultiVersionConsistencyControl.resetThreadReadPoint(mvcc);
}
scannerReadPoints.put(this, this.readPt);
}
// Here we separate all scanners into two lists - scanner that provide data required
// by the filter to operate (scanners list) and all others (joinedScanners list).
List<KeyValueScanner> scanners = newArrayList<KeyValueScanner>();
List<KeyValueScanner> joinedScanners = newArrayList<KeyValueScanner>();
if (additionalScanners != null) {
scanners.addAll(additionalScanners);
}
迭代每一个要进行scan的store。生成具体的StoreScanner实例。通常情况下joinedHead的值为null
for (Map.Entry<byte[], NavigableSet<byte[]>> entry :
scan.getFamilyMap().entrySet()) {
Storestore = stores.get(entry.getKey());
生成StoreScanner实例。通过HStore.getScanner(scan,columns);
KeyValueScannerscanner = store.getScanner(scan, entry.getValue());
if (this.filter == null || !scan.doLoadColumnFamiliesOnDemand()
|| this.filter.isFamilyEssential(entry.getKey())) {
scanners.add(scanner);
} else {
joinedScanners.add(scanner);
}
}
生成KeyValueHeap实例,把所有的storescanner的开始位置移动到startkey的位置并得到top的StoreScanner,
this.storeHeap = newKeyValueHeap(scanners, comparator);
if (!joinedScanners.isEmpty()) {
this.joinedHeap = newKeyValueHeap(joinedScanners, comparator);
}
}
得到StoreScanner实例的HStore.getScanner(scan,columns)方法
publicKeyValueScannergetScanner(Scan scan,
finalNavigableSet<byte []> targetCols) throws IOException {
lock.readLock().lock();
try {
KeyValueScannerscanner = null;
if (this.getCoprocessorHost() != null) {
scanner = this.getCoprocessorHost().preStoreScannerOpen(this, scan, targetCols);
}
if (scanner == null) {
scanner = newStoreScanner(this, getScanInfo(), scan, targetCols);
}
returnscanner;
} finally {
lock.readLock().unlock();
}
}
生成StoreScanner的构造方法:
publicStoreScanner(Storestore, ScanInfo scanInfo, Scan scan, finalNavigableSet<byte[]> columns)
throws IOException {
this(store, scan.getCacheBlocks(), scan, columns, scanInfo.getTtl(),
scanInfo.getMinVersions());
如果设置有scan的_raw_属性时,columns的值需要为null
if (columns != null && scan.isRaw()) {
thrownewDoNotRetryIOException(
"Cannot specify any column for a raw scan");
}
matcher = newScanQueryMatcher(scan, scanInfo, columns,
ScanType.USER_SCAN, Long.MAX_VALUE, HConstants.LATEST_TIMESTAMP,
oldestUnexpiredTS);
得到StoreFileScanner,StoreFileScanner中引用的StoreFile.Reader中引用HFileReaderV2,
HFileReaderV2的实例在StoreFile.Reader中如果已经存在,不会重新创建,这样会加快scanner的创建时间。
// Pass columns to try to filter out unnecessary StoreFiles.
List<KeyValueScanner> scanners = getScannersNoCompaction();
// Seek all scanners to the start of the Row (or if the exact matching row
// key does not exist, then to the start of the next matching Row).
// Always check bloom filter to optimize the top row seek for delete
// family marker.
if (explicitColumnQuery && lazySeekEnabledGlobally) {
for (KeyValueScannerscanner : scanners) {
scanner.requestSeek(matcher.getStartKey(), false, true);
}
} else {
if (!isParallelSeekEnabled) {
for (KeyValueScannerscanner : scanners) {
scanner.seek(matcher.getStartKey());
}
} else {
parallelSeek(scanners, matcher.getStartKey());
}
}
// set storeLimit
this.storeLimit = scan.getMaxResultsPerColumnFamily();
// set rowOffset
this.storeOffset = scan.getRowOffsetPerColumnFamily();
// Combine all seeked scanners with a heap
heap = newKeyValueHeap(scanners, store.getComparator());
注册,如果有storefile更新时,把更新后的storefile添加到这个StoreScanner中来。
this.store.addChangedReaderObserver(this);
}
发起scan的rpc操作
client端发起openScanner操作后,得到一个scannerId.此时发起scan操作。
通过ScannerCallable.call中发起call的操作,在scannerId不等于-1时,
Result [] rrs = null;
ScanRequest request = null;
try {
incRPCcallsMetrics();
request = RequestConverter.buildScanRequest(scannerId, caching, false, nextCallSeq);
ScanResponse response = null;
PayloadCarryingRpcController controller = newPayloadCarryingRpcController();
try {
controller.setPriority(getTableName());
response = getStub().scan(controller, request);
...................................此处省去一些代码
nextCallSeq++;
longtimestamp = System.currentTimeMillis();
// Results are returned via controller
CellScannercellScanner = controller.cellScanner();
rrs = ResponseConverter.getResults(cellScanner, response);
HregionServer.scan方法中对scan时的处理流程:
得到scan中的caching属性的值,此值主要用来响应client返回的条数。如果一行数据包含多个kv,算一条
introws = 1;
if (request.hasNumberOfRows()) {
rows = request.getNumberOfRows();
}
如果client传入的scannerId有值,也就是不等于-1时,表示不是openScanner操作,检查scannerid是否过期
if (request.hasScannerId()) {
rsh = scanners.get(scannerName);
if (rsh == null) {
LOG.info("Client tried to access missing scanner " + scannerName);
thrownewUnknownScannerException(
"Name: " + scannerName + ", already closed?");
}
此处主要是检查region是否发生过split操作。如果是会出现NotServingRegionException操作。
scanner = rsh.s;
HRegionInfo hri = scanner.getRegionInfo();
region = getRegion(hri.getRegionName());
if (region != rsh.r) { // Yes, should be the same instance
thrownewNotServingRegionException("Region was re-opened after the scanner"
+ scannerName + " was created: " + hri.getRegionNameAsString());
}
} else {
...................................此处省去一些生成Regionscanner的代码
}
表示有设置caching,如果是执行scan,此时的默认值为1,当前scan中设置有caching后,使用scan中设置的值
if (rows > 0) {
// if nextCallSeq does not match throw Exception straight away. This needs to be
// performed even before checking of Lease.
// See HBASE-5974
是否有配置nextCallSeq的值,第一次调用时,此值为0,每调用一次加一,client也一样,每调用一次加一。
if (request.hasNextCallSeq()) {
if (rsh == null) {
rsh = scanners.get(scannerName);
}
if (rsh != null) {
if (request.getNextCallSeq() != rsh.nextCallSeq) {
thrownewOutOfOrderScannerNextException("Expected nextCallSeq: " + rsh.nextCallSeq
+ " But the nextCallSeq got from client: " + request.getNextCallSeq() +
"; request=" + TextFormat.shortDebugString(request));
}
// Increment the nextCallSeq value which is the next expected from client.
rsh.nextCallSeq++;
}
}
try {
先从租约管理中移出此租约,防止查找时间大于过期时间而出现的超时
// Remove lease while its being processed in server; protects against case
// where processing of request takes > lease expiration time.
lease = leases.removeLease(scannerName);
生成要返回的条数的一个列表,scan.caching
List<Result> results = newArrayList<Result>(rows);
longcurrentScanResultSize = 0;
booleandone = false;
调用cp的preScannernext,如果返回为true,表示不在执行scan操作。
// Call coprocessor. Get region info from scanner.
if (region != null && region.getCoprocessorHost() != null) {
Boolean bypass = region.getCoprocessorHost().preScannerNext(
scanner, results, rows);
if (!results.isEmpty()) {
for (Result r : results) {
if (maxScannerResultSize < Long.MAX_VALUE){
for (Cellkv : r.rawCells()) {
// TODO
currentScanResultSize += KeyValueUtil.ensureKeyValue(kv).heapSize();
}
}
}
}
if (bypass != null && bypass.booleanValue()) {
done = true;
}
}
执行scan操作。Cp的preScannerNext返回为false,或没有设置cp(主要是RegionObServer)
返回给client的最大size通过hbase.client.scanner.max.result.size配置,默认为long.maxvalue
如果scan也设置有maxResultSize,使用scan设置的值
if (!done) {
longmaxResultSize = scanner.getMaxResultSize();
if (maxResultSize <= 0) {
maxResultSize = maxScannerResultSize;
}
List<Cell> values = newArrayList<Cell>();
MultiVersionConsistencyControl.setThreadReadPoint(scanner.getMvccReadPoint());
region.startRegionOperation(Operation.SCAN);
try {
inti = 0;
synchronized(scanner) {
此处开始迭代,开始调用regionScanner(HRegion.RegionScannerImpl.nextRaw(List))进行查找,
迭代的长度为scan设置的caching的大小,如果执行RegionScanner.nextRaw(List)返回为false,时也会停止迭代
for (; i < rows
&& currentScanResultSize < maxResultSize; i++) {
返回的true表示还有数据,可以接着查询,否则表示此region中已经没有符合条件的数据了。
// Collect values to be returned here
booleanmoreRows = scanner.nextRaw(values);
if (!values.isEmpty()) {
if (maxScannerResultSize < Long.MAX_VALUE){
for (Cellkv : values) {
currentScanResultSize += KeyValueUtil.ensureKeyValue(kv).heapSize();
}
}
results.add(Result.create(values));
}
if (!moreRows) {
break;
}
values.clear();
}
}
region.readRequestsCount.add(i);
} finally {
region.closeRegionOperation();
}
// coprocessor postNext hook
if (region != null && region.getCoprocessorHost() != null) {
region.getCoprocessorHost().postScannerNext(scanner, results, rows, true);
}
}
如果没有可以再查找的数据时,设置response的moreResults为false
// If the scanner's filter - if any - is done with the scan
// and wants to tell the client to stop the scan. This is done by passing
// a null result, and setting moreResults to false.
if (scanner.isFilterDone() && results.isEmpty()) {
moreResults = false;
results = null;
} else {
添加结果到response中,如果hbase.client.rpc.codec配置有codec的值,
默认取hbase.client.default.rpc.codec配置的值,默认为KeyValueCodec
如果上面说的codec配置不为null时,把results生成为一个iterator,并生成一个匿名的CallScanner实现类
设置到scan时传入的controller中。这样能提升查询数据的读取性能。
如果没有配置codec时,默认直接把results列表设置到response中,这样响应的数据可能会比较大。
addResults(builder, results, controller);
}
} finally {
重新把租约放入到租约检查管理器中,此租约主要来检查client多长时间没有发起过scan的操作。
// We're done. On way out re-add the above removed lease.
// Adding resets expiration time on lease.
if (scanners.containsKey(scannerName)) {
if (lease != null) leases.addLease(lease);
ttl = this.scannerLeaseTimeoutPeriod;
}
}
}
client端获取响应的数据:ScannerCallable.call方法中
rrs = ResponseConverter.getResults(cellScanner, response);
ResponseConverter.getResults方法的实现
publicstatic Result[] getResults(CellScannercellScanner, ScanResponse response)
throws IOException {
if (response == null) returnnull;
// If cellscanner, then the number of Results to return is the count of elements in the
// cellsPerResult list. Otherwise, it is how many results are embedded inside the response.
intnoOfResults = cellScanner != null?
response.getCellsPerResultCount(): response.getResultsCount();
Result[] results = new Result[noOfResults];
for (inti = 0; i < noOfResults; i++) {
cellScanner如果codec配置为有值时,在rs响应时会生成一个匿名的实现
if (cellScanner != null) {
......................................
intnoOfCells = response.getCellsPerResult(i);
List<Cell> cells = newArrayList<Cell>(noOfCells);
for (intj = 0; j < noOfCells; j++) {
try {
if (cellScanner.advance() == false) {
.....................................
String msg = "Results sent from server=" + noOfResults + ". But only got " + i
+ " results completely at client. Resetting the scanner to scan again.";
LOG.error(msg);
thrownewDoNotRetryIOException(msg);
}
} catch (IOException ioe) {
...........................................
LOG.error("Exception while reading cells from result."
+ "Resetting the scanner to scan again.", ioe);
thrownewDoNotRetryIOException("Resetting the scanner.", ioe);
}
cells.add(cellScanner.current());
}
results[i] = Result.create(cells);
} else {
否则,没有设置codec,直接从response中读取出来数据,
// Result is pure pb.
results[i] = ProtobufUtil.toResult(response.getResults(i));
}
}
returnresults;
}
在ClientScanner.next方法中,如果还没有达到scan的caching的值,(默认为1)也就是countdown的值还不等于0
,countdown的值为得到一个Result时减1,通过nextScanner重新得到下一个region,并发起连接去scan数据。
Do{
.........................此处省去一些代码。
if (values != null && values.length > 0) {
for (Result rs : values) {
cache.add(rs);
for (Cellkv : rs.rawCells()) {
// TODO make method in Cell or CellUtil
remainingResultSize -= KeyValueUtil.ensureKeyValue(kv).heapSize();
}
countdown--;
this.lastResult = rs;
}
}
} while (remainingResultSize > 0 && countdown > 0 && nextScanner(countdown, values == null));
对于这种类型的查询操作,可以使用得到一个ClientScanner后,不执行close操作。
在rs的timeout前每次定期去从rs中拿一定量的数据下来。缓存到ClientScanner的cache中。
每次next时从cache中直接拿数据
Hregion.RegionScannerImpl.nextRaw(list)方法分析
RegionScannerImpl是对RegionScanner接口的实现。
Rs的scan在执行时通过regionScanner.nextRaw(list)来获取数据。
通过regionScanner.isFilterDone来检查此region的查找是否完成。
调用nextRaw方法,此方法调用另一个重载方法,batch是scan中设置的每次可查询最大的单行中的多少个kv的kv个数
publicbooleannextRaw(List<Cell> outResults)
throws IOException {
returnnextRaw(outResults, batch);
}
publicbooleannextRaw(List<Cell> outResults, intlimit) throws IOException {
booleanreturnResult;
调用nextInternal方法。
if (outResults.isEmpty()) {
// Usually outResults is empty. This is true when next is called
// to handle scan or get operation.
returnResult = nextInternal(outResults, limit);
} else {
List<Cell> tmpList = newArrayList<Cell>();
returnResult = nextInternal(tmpList, limit);
outResults.addAll(tmpList);
}
调用filter.reset方法,清空当前row的filter的相关信息。
ResetFilters();
如果filter.filterAllRemaining()的返回值为true,时表示当前region的查找条件已经结束,不能在执行查找操作。
没有可以接着查找的需要,也就是没有更多要查找的行了。
if (isFilterDone()) {
returnfalse;
}
................................此处省去一些代码
returnreturnResult;
}
nextInternal方法处理流程:
privatebooleannextInternal(List<Cell> results, intlimit)
throws IOException {
if (!results.isEmpty()) {
thrownewIllegalArgumentException("First parameter should be an empty list");
}
RpcCallContextrpcCall = RpcServer.getCurrentCall();
// The loop here is used only when at some point during the next we determine
// that due to effects of filters or otherwise, we have an empty row in the result.
// Then we loop and try again. Otherwise, we must get out on the first iteration via return,
// "true" if there's more data to read, "false" if there isn't (storeHeap is at a stop row,
// and joinedHeap has no more data to read for the last row (if set, joinedContinuationRow).
while (true) {
if (rpcCall != null) {
// If a user specifies a too-restrictive or too-slow scanner, the
// client might time out and disconnect while the server side
// is still processing the request. We should abort aggressively
// in that case.
longafterTime = rpcCall.disconnectSince();
if (afterTime >= 0) {
thrownewCallerDisconnectedException(
"Aborting on region " + getRegionNameAsString() + ", call " +
this + " after " + afterTime + " ms, since " +
"caller disconnected");
}
}
得到通过startkey seek后当前最小的一个kv。
// Let's see what we have in the storeHeap.
KeyValue current = this.storeHeap.peek();
byte[] currentRow = null;
intoffset = 0;
shortlength = 0;
if (current != null) {
currentRow = current.getBuffer();
offset = current.getRowOffset();
length = current.getRowLength();
}
检查是否到了stopkey,如果是,返回false,joinedContinuationRow是多个cf的关联查找,不用去管它
booleanstopRow = isStopRow(currentRow, offset, length);
// Check if we were getting data from the joinedHeap and hit the limit.
// If not, then it's main path - getting results from storeHeap.
if (joinedContinuationRow == null) {
// First, check if we are at a stop row. If so, there are no more results.
if (stopRow) {
如果是stopRow,同时filter.hasFilterRow返回为true时,
可通过filterRowCells来检查要返回的kvlist,也可以用来修改要返回的kvlist
if (filter != null && filter.hasFilterRow()) {
filter.filterRowCells(results);
}
returnfalse;
}
通过filter.filterRowkey来过滤检查key是否需要排除,如果是排除返回true,否则返回false
// Check if rowkey filter wants to exclude this row. If so, loop to next.
// Technically, if we hit limits before on this row, we don't need this call.
if (filterRowKey(currentRow, offset, length)) {
如果rowkey是需要排除的rowkey,检查是否有下一行数据。如果没有下一行数据,返回flase,表示当前region查找结束
否则清空当前的results,重新进行查找
booleanmoreRows = nextRow(currentRow, offset, length);
if (!moreRows) returnfalse;
results.clear();
continue;
}
开始执行region下此scan需要的所有store的StoreScanner的next进行查找,把查找的结果放到results列表中。
如果一行中包含有多个kv,现在查找这些kv达到传入的limit的大小的时候,返回kv_limit的一个空的kv。
(查找的大小已经达到limit(batch)的一行最大scan的kv个数,返回kv_limit),
否则表示还没有查找到limit的kv个数,但是当前row对应的所有达到条件的kv都已经查找完成,返回最后一个kv。
返回的kv如果不是kv_limit,那么有可能是null或者是下一行的第一个kv.
KeyValue nextKv = populateResult(results, this.storeHeap, limit, currentRow, offset,
length);
如果达到limit的限制时,filter.hasFilterRow的值一定得是false,
否则会throw IncompatibleFilterException
如果达到limit的限制时,返回true,当前row的所有kv查找结束,返回true可以接着向下查找
提示:如果hbase一行数据中可能包含多个kv时,最好是在scan时设置batch的属性,否则会一直查找到所有的kv结束
// Ok, we are good, let's try to get some results from the main heap.
if (nextKv == KV_LIMIT) {
if (this.filter != null && filter.hasFilterRow()) {
thrownewIncompatibleFilterException(
"Filter whose hasFilterRow() returns true is incompatible with scan with limit!");
}
returntrue; // We hit the limit.
}
是否到结束行,从这一行代码中可以看出,stoprow是不包含的,因为nextKv肯定是下一行row中第一个kv的值
stopRow = nextKv == null ||
isStopRow(nextKv.getBuffer(), nextKv.getRowOffset(), nextKv.getRowLength());
// save that the row was empty before filters applied to it.
finalbooleanisEmptyRow = results.isEmpty();
如果是stopRow,同时filter.hasFilterRow返回为true时,
可通过filterRowCells来检查要返回的kvlist,也可以用来修改要返回的kvlist
// We have the part of the row necessary for filtering (all of it, usually).
// First filter with the filterRow(List).
if (filter != null && filter.hasFilterRow()) {
filter.filterRowCells(results);
}
如果当前row的查找没有找到合法的kv,也就是results的列表没有值,检查是否还有下一行,
如果有,重新进行查找,否则表示当前region的查找最结尾处,不能再进行查找,返回fasle
if (isEmptyRow) {
booleanmoreRows = nextRow(currentRow, offset, length);
if (!moreRows) returnfalse;
results.clear();
// This row was totally filtered out, if this is NOT the last row,
// we should continue on. Otherwise, nothing else to do.
if (!stopRow) continue;
returnfalse;
}
// Ok, we are done with storeHeap for this row.
// Now we may need to fetch additional, non-essential data into row.
// These values are not needed for filter to work, so we postpone their
// fetch to (possibly) reduce amount of data loads from disk.
if (this.joinedHeap != null) {
..................................进行关联查找的代码,不显示,也不分析
}
} else {
多个store进行关联查询,不分析,通常情况不会有
// Populating from the joined heap was stopped by limits, populate some more.
populateFromJoinedHeap(results, limit);
}
// We may have just called populateFromJoinedMap and hit the limits. If that is
// the case, we need to call it again on the next next() invocation.
if (joinedContinuationRow != null) {
returntrue;
}
如果这次的查找,results的结果为空,表示没有查找到结果,检查是否还有下一行数据,如果有重新进行查找,
否则返回false表示此region的查找结束
// Finally, we are done with both joinedHeap and storeHeap.
// Double check to prevent empty rows from appearing in result. It could be
// the case when SingleColumnValueExcludeFilter is used.
if (results.isEmpty()) {
booleanmoreRows = nextRow(currentRow, offset, length);
if (!moreRows) returnfalse;
if (!stopRow) continue;
}
非stoprow时,表示还可以有下一行的数据,也就是可以接着进行next操作。否则表示此region的查找结束
// We are done. Return the result.
return !stopRow;
}
}
UserScan时的ScanQueryMatcher.match方法处理
user scan时的ScanQueryMatcher在newRegionScannerImpl(scan, additionalScanners, this);时生成。
在生成StoreScanner时通过如下代码生成matcher实例。
matcher = newScanQueryMatcher(scan, scanInfo, columns,
ScanType.USER_SCAN, Long.MAX_VALUE, HConstants.LATEST_TIMESTAMP,
oldestUnexpiredTS);
matcher.isUserScan的值此时为true.
publicMatchCodematch(KeyValue kv) throws IOException {
检查当前region的查找是否结束。pageFilter就是通过控制此filter中的方法来检查是否需要
if (filter != null && filter.filterAllRemaining()) {
returnMatchCode.DONE_SCAN;
}
byte [] bytes = kv.getBuffer();
intoffset = kv.getOffset();
intkeyLength = Bytes.toInt(bytes, offset, Bytes.SIZEOF_INT);
offset += KeyValue.ROW_OFFSET;
intinitialOffset = offset;
shortrowLength = Bytes.toShort(bytes, offset, Bytes.SIZEOF_SHORT);
offset += Bytes.SIZEOF_SHORT;
检查传入的kv是否是当前行的kv,也就是rowkey是否相同,如果当前的rowkey小于传入的rowkey,
表示现在已经next到下一行了,返回DONE,表示当前行查找结束
intret = this.rowComparator.compareRows(row, this.rowOffset, this.rowLength,
bytes, offset, rowLength);
if (ret <= -1) {
returnMatchCode.DONE;
} elseif (ret >= 1) {
如果当前的rowkey大于传入的rowkey,表示当前next出来的kv比现在的kv要小,执行nextrow操作。
// could optimize this, if necessary?
// Could also be called SEEK_TO_CURRENT_ROW, but this
// should be rare/never happens.
returnMatchCode.SEEK_NEXT_ROW;
}
是否跳过当前行的其它kv比较,这是一个优化项。
// optimize case.
if (this.stickyNextRow)
returnMatchCode.SEEK_NEXT_ROW;
如果当前行的所有要查找的(scan)column都查找完成了,其它的当前行中非要scan的kv,
直接不比较,执行nextrow操作。
if (this.columns.done()) {
stickyNextRow = true;
returnMatchCode.SEEK_NEXT_ROW;
}
//Passing rowLength
offset += rowLength;
//Skipping family
bytefamilyLength = bytes [offset];
offset += familyLength + 1;
intqualLength = keyLength -
(offset - initialOffset) - KeyValue.TIMESTAMP_TYPE_SIZE;
检查当前KV的TTL是否过期,如果过期,检查是否SCAN中还有下一个COLUMN,如果有返回SEEK_NEXT_COL。
否则返回SEEK_NEXT_ROW。
longtimestamp = Bytes.toLong(bytes, initialOffset + keyLength - KeyValue.TIMESTAMP_TYPE_SIZE);
// check for early out based on timestamp alone
if (columns.isDone(timestamp)) {
returncolumns.getNextRowOrNextColumn(bytes, offset, qualLength);
}
/*
* The delete logic is pretty complicated now.
* This is corroborated by the following:
* 1. The store might be instructed to keep deleted rows around.
* 2. A scan can optionally see past a delete marker now.
* 3. If deleted rows are kept, we have to find out when we can
* remove the delete markers.
* 4. Family delete markers are always first (regardless of their TS)
* 5. Delete markers should not be counted as version
* 6. Delete markers affect puts of the *same* TS
* 7. Delete marker need to be version counted together with puts
* they affect
*/
bytetype = bytes[initialOffset + keyLength – 1];
如果当前KV是删除的KV。
if (kv.isDelete()) {
此处会进入。把删除的KV添加到DeleteTracker中,默认是ScanDeleteTracker
if (!keepDeletedCells) {
// first ignore delete markers if the scanner can do so, and the
// range does not include the marker
//
// during flushes and compactions also ignore delete markers newer
// than the readpoint of any open scanner, this prevents deleted
// rows that could still be seen by a scanner from being collected
booleanincludeDeleteMarker = seePastDeleteMarkers ?
tr.withinTimeRange(timestamp) :
tr.withinOrAfterTimeRange(timestamp);
if (includeDeleteMarker
&& kv.getMvccVersion() <= maxReadPointToTrackVersions) {
this.deletes.add(bytes, offset, qualLength, timestamp, type);
}
// Can't early out now, because DelFam come before any other keys
}
此处的检查不会进入,userscan不保留删除的数据
if (retainDeletesInOutput
|| (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes)
|| kv.getMvccVersion() > maxReadPointToTrackVersions) {
// always include or it is not time yet to check whether it is OK
// to purge deltes or not
if (!isUserScan) {
// if this is not a user scan (compaction), we can filter this deletemarker right here
// otherwise (i.e. a "raw" scan) we fall through to normal version and timerange checking
returnMatchCode.INCLUDE;
}
} elseif (keepDeletedCells) {
if (timestamp < earliestPutTs) {
// keeping delete rows, but there are no puts older than
// this delete in the store files.
returncolumns.getNextRowOrNextColumn(bytes, offset, qualLength);
}
// else: fall through and do version counting on the
// delete markers
} else {
returnMatchCode.SKIP;
}
// note the following next else if...
// delete marker are not subject to other delete markers
} elseif (!this.deletes.isEmpty()) {
如果deleteTracker中不为空时,也就是当前行中有删除的KV,检查当前KV是否是删除的KV
提示:删除的KV在compare时,比正常的KV要小,所以在执行next操作时,delete的KV会先被查找出来。
如果是删除的KV,根据KV的删除类型,如果是版本被删除,返回SKIP。
否则如果SCAN中还有下一个要SCAN的column时,返回SEEK_NEXT_COL。
否则表示当前行没有需要在进行查找的KV,返回SEEK_NEXT_ROW。
DeleteResultdeleteResult = deletes.isDeleted(bytes, offset, qualLength,
timestamp);
switch (deleteResult) {
caseFAMILY_DELETED:
caseCOLUMN_DELETED:
returncolumns.getNextRowOrNextColumn(bytes, offset, qualLength);
caseVERSION_DELETED:
caseFAMILY_VERSION_DELETED:
returnMatchCode.SKIP;
caseNOT_DELETED:
break;
default:
thrownewRuntimeException("UNEXPECTED");
}
}
检查KV的时间是否在SCAN要查找的时间范围内,
inttimestampComparison = tr.compare(timestamp);
如果大于SCAN的最大时间,返回SKIP。
if (timestampComparison >= 1) {
returnMatchCode.SKIP;
} elseif (timestampComparison <= -1) {
如果小于SCAN的最小时间,如果SCAN中还有下一个要SCAN的column时,返回SEEK_NEXT_COL。
否则表示当前行没有需要在进行查找的KV,返回SEEK_NEXT_ROW。
returncolumns.getNextRowOrNextColumn(bytes, offset, qualLength);
}
检查当前KV的column是否是SCAN中指定的column列表中包含的值,如果是INCLUDE。
否则如果SCAN中还有下一个要SCAN的column时,返回SEEK_NEXT_COL。
否则表示当前行没有需要在进行查找的KV,返回SEEK_NEXT_ROW。
// STEP 1: Check if the column is part of the requested columns
MatchCodecolChecker = columns.checkColumn(bytes, offset, qualLength, type);
如果column是SCAN中要查找的column之一
if (colChecker == MatchCode.INCLUDE) {
ReturnCodefilterResponse = ReturnCode.SKIP;
// STEP 2: Yes, the column is part of the requested columns. Check if filter is present
if (filter != null) {
执行filter.filterKeyValue操作。并返回filter过滤的结果
// STEP 3: Filter the key value and return if it filters out
filterResponse = filter.filterKeyValue(kv);
switch (filterResponse) {
caseSKIP:
returnMatchCode.SKIP;
caseNEXT_COL:
如果SCAN中还有下一个要SCAN的column时,返回SEEK_NEXT_COL。
否则表示当前行没有需要在进行查找的KV,返回SEEK_NEXT_ROW。
returncolumns.getNextRowOrNextColumn(bytes, offset, qualLength);
caseNEXT_ROW:
stickyNextRow = true;
returnMatchCode.SEEK_NEXT_ROW;
caseSEEK_NEXT_USING_HINT:
returnMatchCode.SEEK_NEXT_USING_HINT;
default:
//It means it is either include or include and seek next
break;
}
}
/*
* STEP 4: Reaching this step means the column is part of the requested columns and either
* the filter is null or the filter has returned INCLUDE or INCLUDE_AND_NEXT_COL response.
* Now check the number of versions needed. This method call returns SKIP, INCLUDE,
* INCLUDE_AND_SEEK_NEXT_ROW, INCLUDE_AND_SEEK_NEXT_COL.
*
* FilterResponse ColumnChecker Desired behavior
* INCLUDE SKIP row has already been included, SKIP.
* INCLUDE INCLUDE INCLUDE
* INCLUDE INCLUDE_AND_SEEK_NEXT_COL INCLUDE_AND_SEEK_NEXT_COL
* INCLUDE INCLUDE_AND_SEEK_NEXT_ROW INCLUDE_AND_SEEK_NEXT_ROW
* INCLUDE_AND_SEEK_NEXT_COL SKIP row has already been included, SKIP.
* INCLUDE_AND_SEEK_NEXT_COL INCLUDE INCLUDE_AND_SEEK_NEXT_COL
* INCLUDE_AND_SEEK_NEXT_COL INCLUDE_AND_SEEK_NEXT_COL INCLUDE_AND_SEEK_NEXT_COL
* INCLUDE_AND_SEEK_NEXT_COL INCLUDE_AND_SEEK_NEXT_ROW INCLUDE_AND_SEEK_NEXT_ROW
*
* In all the above scenarios, we return the column checker return value except for
* FilterResponse (INCLUDE_AND_SEEK_NEXT_COL) and ColumnChecker(INCLUDE)
*/
此处主要是检查KV的是否是SCAN的最大版本内,到这个地方,除非是KV超过了要SCAN的最大版本,或者KV的TTL过期。
否则肯定是会包含此KV的值。
colChecker =
columns.checkVersions(bytes, offset, qualLength, timestamp, type,
kv.getMvccVersion() > maxReadPointToTrackVersions);
//Optimize with stickyNextRow
stickyNextRow = colChecker == MatchCode.INCLUDE_AND_SEEK_NEXT_ROW ? true : stickyNextRow;
return (filterResponse == ReturnCode.INCLUDE_AND_NEXT_COL &&
colChecker == MatchCode.INCLUDE) ? MatchCode.INCLUDE_AND_SEEK_NEXT_COL
: colChecker;
}
stickyNextRow = (colChecker == MatchCode.SEEK_NEXT_ROW) ? true
: stickyNextRow;
returncolChecker;
}
相关推荐
内容概要:本文提供了详细的MongoDB分片集群的搭建指导,涵盖了从环境准备、配置文件编写、副本集的建立、主节点的选择、配置服务器和数据分片服务器的配置到最后的路由节点的搭建与操作整个流程,以及对数据库的哈希与范围两种分片策略的应用介绍和具体命令执行。 适合人群:熟悉NoSQL数据库概念并对MongoDB有一定了解的技术人员,尤其是在大型数据管理和分布式数据库架构设计中有需求的开发者。 使用场景及目标:帮助技术人员掌握构建高效能、高可用性的MongoDB分片集群的方法,适用于处理大规模、实时性强的数据存储与读取场景。 其他说明:文中通过实例演示了每个步骤的具体操作方法,便于跟随文档实操,同时也介绍了可能遇到的问题及其解决方案,如在没有正确配置的情况下试图写入数据时出现错误等情况的处理。
CPPC++_嵌入式硬件的物联网解决方案blinker库与Arduino ESP8266 ESP32一起工作
CPPC++_逆向调用QQ Mojo IPC与WeChat XPlugin
CPPC++_现代活动指标
CPPC++_Xournal是一款手写笔记软件,支持PDF注释,使用C语言编写,支持GTK3,支持Linux,如Ubu
资源概述: 本资源提供了一套完整的学生实习管理系统解决方案,涵盖了前台小程序页面与后台管理系统两大模块。前台小程序页面设计简洁直观,用户可根据不同身份(学生或企业)进行登录。学生用户能够方便地浏览并投递感兴趣的实习岗位,而企业用户则能轻松发布实习信息,吸引优秀人才。后台管理系统功能全面,包括个人中心、首页、学生管理、教师管理、企业管理、招聘管理、评分管理以及实习管理等多个方面,为管理员提供了强大的数据管理和操作工具。 技术栈亮点: SSM框架:系统后台采用Spring、Spring MVC和MyBatis Plus(简称SSM)作为核心开发框架,确保了系统的稳定性、可扩展性和可维护性。Spring作为控制反转(IoC)和面向切面编程(AOP)的容器,为系统提供了强大的业务逻辑处理能力;Spring MVC则负责处理Web请求和响应,实现了前后端的分离;MyBatis Plus作为持久层框架,简化了数据库操作,提高了开发效率。 MySQL数据库:系统采用MySQL作为数据库存储解决方案,支持大数据量的存储和高效查询。 如有侵权请联系我删除,谢谢
微服务闪聚支付项目
博客链接 https://blog.csdn.net/weixin_47560078/article/details/143714557 文章从原理介绍出发,实现了 Rust 与 Java 的互调。利用 JNI 技术,可以充分发挥 Rust 的性能优势,同时保持 Java 的跨平台特性。这种技术组合适用于对性能要求较高的应用场景,如图像处理、数据分析和系统级编程等。
cppc++
1.版本:matlab2014/2019a/2024a 2.附赠案例数据可直接运行matlab程序。 3.代码特点:参数化编程、参数可方便更改、代码编程思路清晰、注释明细。 4.适用对象:计算机,电子信息工程、数学等专业的大学生课程设计、期末大作业和毕业设计。 替换数据可以直接使用,注释清楚,适合新手
1.版本:matlab2014/2019a/2024a 2.附赠案例数据可直接运行matlab程序。 3.代码特点:参数化编程、参数可方便更改、代码编程思路清晰、注释明细。 4.适用对象:计算机,电子信息工程、数学等专业的大学生课程设计、期末大作业和毕业设计。 替换数据可以直接使用,注释清楚,适合新手
分布式事务lcn
1.版本:matlab2014/2019a/2024a 2.附赠案例数据可直接运行matlab程序。 3.代码特点:参数化编程、参数可方便更改、代码编程思路清晰、注释明细。 4.适用对象:计算机,电子信息工程、数学等专业的大学生课程设计、期末大作业和毕业设计。
1.版本:matlab2014/2019a/2024a 2.附赠案例数据可直接运行matlab程序。 3.代码特点:参数化编程、参数可方便更改、代码编程思路清晰、注释明细。 4.适用对象:计算机,电子信息工程、数学等专业的大学生课程设计、期末大作业和毕业设计。
cppc++
安卓手机与电脑的socket通信源码
Anaconda:JupyterNotebook使用教程.docx
Amazon S3:S3静态网站托管教程.docx
Python商品销售数据分析可视化项目源码(期末大作业).zip,个人经导师指导并认可通过的98分大作业设计项目。主要针对计算机相关专业的正在做期末大作业设计的学生和需要项目实战练习的学习者,可作为课程设计、期末大作业,代码资料完整下载可用。 Python商品销售数据分析可视化项目源码(期末大作业).zip,个人经导师指导并认可通过的98分大作业设计项目。主要针对计算机相关专业的正在做期末大作业设计的学生和需要项目实战练习的学习者,可作为课程设计、期末大作业,代码资料完整下载可用。Python商品销售数据分析可视化项目源码(期末大作业).zip,个人经导师指导并认可通过的98分大作业设计项目。主要针对计算机相关专业的正在做期末大作业设计的学生和需要项目实战练习的学习者,可作为课程设计、期末大作业,代码资料完整下载可用。Python商品销售数据分析可视化项目源码(期末大作业).zip,个人经导师指导并认可通过的98分大作业设计项目。主要针对计算机相关专业的正在做期末大作业设计的学生和需要项目实战练习的学习者,可作为课程设计、期末大作业,代码资料完整下载可用。Python商品销售数据分析
CPPC++_wechathookWeChatApi微信Api微信hook微信接口python微信接口java微信Ap